Alexey Zakhlestin's Blog

Programming for Mac and Web

FastCGI in PHP. The Way It Could Be


Intro

Most PHP programmers believe that PHP has support for FastCGI. They refer to the FastCGI SAPI, which has been bundled with PHP for a long time and was recently reimplemented for PHP 5.1.3/4. This SAPI really does exist and actually works quite well. But… it is not real FastCGI. It is just an imitation of mod_php that is linked against the FastCGI API instead of the Apache API. So it’s time for you to ask: if it exists and works, then what am I talking about here? Let’s start from the basics.

History

Once upon a time, in a galaxy far-far away, … hmm…

CGI.

The age of server-side web programming, as we know it, started with the introduction of the Common Gateway Interface (CGI) in 1993. It allows an HTTP server to launch an external application, which receives the request (metadata in environment variables, body on stdin) and writes its response to stdout. This was a strictly UNIX-ish way of doing things, because CGI is basically a pipe between the server, which knows how to deal with TCP/IP, and an application, which only knows how to deal with stdin and stdout. A CGI app can be written in any programming language and doesn’t require any specific linking or compile options. The advantage of such simplicity becomes a disadvantage when your web app starts to get a lot of hits: every request creates another process, and creating a process can be quite an expensive operation.
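To make the "pipe" idea concrete, here is a minimal sketch of a CGI program. The function name `handle_cgi_request` and the plain-text reply format are my own illustration; the environment variable names are the standard CGI ones.

```python
import os
import sys


def handle_cgi_request(environ, stdin, stdout):
    """Serve one request the CGI way: metadata from env vars, body from stdin."""
    method = environ.get("REQUEST_METHOD", "GET")
    query = environ.get("QUERY_STRING", "")
    length = int(environ.get("CONTENT_LENGTH") or 0)
    body = stdin.read(length) if length else ""

    # A CGI response is header lines, a blank line, then the body.
    stdout.write("Content-Type: text/plain\r\n\r\n")
    stdout.write(f"method={method} query={query} body={body}\n")


if __name__ == "__main__":
    # The web server starts a fresh process for every single request,
    # so interpreter start-up and all imports are paid again each time.
    handle_cgi_request(os.environ, sys.stdin, sys.stdout)
```

Nothing in the program knows about sockets or about the web server that launched it, which is exactly why CGI is server-independent and also why it is slow.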

pro: web-server independent

contra: slow

This problem received two independent solutions. The first was to create server-dependent APIs for embedding applications directly into the web-server process, and the second was FastCGI.

Server APIs.

Embedding an application into the web server is really good speed-wise, as process initialization happens only once and all requests are served in a preloaded environment. No separate processes are started for serving requests; everything stays inside. This is good and bad at the same time. Think “security”, think “scalability”. Running your application inside the web server means that, at the base level, it has all the rights the web server itself has. On virtual hosting, this means any user can get access to any other user’s files (which include passwords and other sensitive information). At the same time, there is no easy way to scale the application to several machines if the need arises. The scalability issue can be solved by proxying, but that is not the most beautiful solution, imho. (Mark Mayo has a different opinion on this subject.)

The most well-known example of such embedding is Apache’s mod_php. It is faster than CGI, it runs with Apache’s privileges, and it has to be proxied if you need scalability.

pro: fast (lack of startup overhead)

contra: not secure, scalability requires additional tools, incompatible between web servers

FastCGI

Introduced in 1996, FastCGI is a solution that tries to overcome the problems of CGI while keeping its good sides. Basically, it is a wrapper around a CGI application that adds a run loop to it. The application can do some initialization before entering the run loop and some cleanup at the end. The web server uses either UNIX or TCP sockets to communicate with the FastCGI application. This makes it possible to run the target application under any user privileges we need without compromising other users’ privacy, to move the application to a separate machine (or even several separate machines), and still to have very nice request times.

FastCGI is also very nice speed-wise for applications that tend to have a complex initialization process. Just do it before the run loop and leave only the actual serving logic inside.

The typical FastCGI application’s way of doing things (written in pseudo-code):

    load_all_modules();
    init_cache();
    init_database();

    while (wait_for_next_request()) {
        if (time_to_break())
            break;

        parse_request();
        serve_reply();
    }

    shut_down();

pro: fast (lack of startup overhead), secure, scalable, web-server independent

contra: requires a little bit more programming than CGI
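The same shape can be sketched as runnable code. This is only an illustration of the init-once, serve-many pattern from the pseudo-code above: it does not speak the FastCGI protocol, requests are faked with a plain list, and `App`, `run_loop` and `max_requests` are hypothetical names of mine.

```python
class App:
    """Holds state that survives across requests."""

    def __init__(self):
        self.init_count = 0
        self.cache = {}

    def initialize(self):
        # Stand-in for load_all_modules(), init_cache(), init_database().
        self.init_count += 1
        self.cache["greeting"] = "hello"

    def serve(self, request):
        # Only the per-request work happens here.
        return f"{self.cache['greeting']}, {request}"


def run_loop(app, requests, max_requests=1000):
    """Initialize once, then serve requests until told to stop."""
    app.initialize()
    replies = []
    for served, request in enumerate(requests):
        if served >= max_requests:  # plays the role of time_to_break()
            break
        replies.append(app.serve(request))
    return replies


app = App()
replies = run_loop(app, ["alice", "bob", "carol"])
print(replies)
print("initializations:", app.init_count)
```

Three requests are served, but the expensive part runs once. In a real FastCGI application the list would be replaced by a library call that blocks until the web server hands over the next request.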

Getting back to PHP

Now that you have all the required background, let’s talk about mod_php and the way it works. mod_php is an implementation of PHP for the Apache web server and is the most widely used environment for running PHP scripts. The main idea behind it is simplicity for the end user. Each script is run from the beginning when Apache receives the request; it allocates and uses any resources it needs along the way and just forgets about them at the end, because cleaning up is mod_php’s job. This means that every request is completely separated (“sandboxed”) from all other requests, and there is no standard way to keep data persistent between requests. This is really good if you need something simple up and running right now, but it becomes a bottleneck when you build an “enterprise” application, which becomes slow just because it is big.

At some point, a FastCGI implementation appeared in PHP’s list of SAPIs, but the concept behind it was just to emulate mod_php’s behavior and make it consistent across web servers other than Apache. PHP’s FastCGI SAPI doesn’t expose the run loop to the PHP application but implements it internally instead. As a result, we still have the “everything is cleaned up on exit” mode, which leaves us no way to pre-initialize anything. What really bugs me is that all of PHP’s competitors have proper FastCGI support: Perl, Python, Ruby. This is one of the reasons, by the way, why Ruby’s “Rails” and Python’s “Django” work faster than similar PHP frameworks.
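A rough sketch of what "the loop lives inside the SAPI" means in practice. This is a toy model, not PHP’s actual implementation; `expensive_init`, `script` and `sapi_owned_loop` are hypothetical names, and the "script" is just a function that starts from nothing on every call.

```python
init_calls = 0


def expensive_init():
    """Hypothetical stand-in for loading a framework, config, connections."""
    global init_calls
    init_calls += 1
    return {"routes": {"/": "home page"}}


def script(path):
    """A whole 'script': it has to rebuild its world on every run."""
    state = expensive_init()
    return state["routes"].get(path, "not found")


def sapi_owned_loop(paths):
    """The loop belongs to the host; the script never sees it."""
    replies = []
    for path in paths:
        # Fresh start for each request; whatever the script built
        # is thrown away as soon as it returns.
        replies.append(script(path))
    return replies


replies = sapi_owned_loop(["/", "/missing", "/"])
print(replies, "init calls:", init_calls)
```

Three requests cost three initializations here. The process is reused, so the fork overhead of CGI is gone, but the application still has no place to put work that should happen only once.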

The main reason for the current situation is that PHP’s main aim has always been “simplicity” and a strict orientation toward small web apps. Perl, Python and Ruby are general-purpose languages that came into the web world and are looking for the best ways to fit in (so they chose FastCGI), while PHP originated as a web language and carries a big legacy and traditions as such. Proper FastCGI support in the beginning would have slowed PHP adoption (as it would have added complications). PHP is just making its first steps as a general-purpose language and as a language for “big apps”. This time, FastCGI is required.

I am actually in the mood to implement a proper FastCGI SAPI and FastCGI extension for PHP, but I have real trouble finding time for this project. If anyone is interested in doing some coding in this direction, I would be happy to cooperate.


Love and Hate and Everything…


I love APC. Everyone who has read George Schlossnagle’s “Advanced PHP Programming” loves APC. I love it because it makes “enterprise”-sized applications run twice as fast. I love it because it gives me a cheap and convenient shared-memory cache, so my app runs even faster. I love it because it will be a part of PHP6, which gives me hope that it won’t be abandoned and will probably be even faster some day. The best thing about it is that you don’t need to make any changes to your code. You just install and enable the PHP extension and instantly get a 50-200% increase in execution speed.

Right now, you are probably typing “pecl install APC” on your test servers, running the first benchmarks and drawing presentation diagrams for your boss. But wait just a moment before sending her that email, because…

…well, oh God, I hate APC. I hate it because of one tiny little bug, which was reported long ago and is still not fixed. This bug makes APC close to unusable for “enterprise”-sized applications (which, as I already mentioned, get the biggest speed gain from APC). The worst thing about this bug is that it is possible not to notice it for quite some time. Some parts of the web app will continue to work (actually, they will work very fast), but some other pages will leave you with a strange message in the error log about a missing parent class at a point in your code where you are 100% sure you have it defined.

And then you know that you are turning APC off this very minute, because you don’t want your boss to see that error. And you put all those nice diagrams in the furthest folder and start to hate APC for ruining the beautiful dream you just had.

FIN.

I hope it is not

George, Rasmus, anyone: please fix APC’s bug #5314 as soon as possible. Let our dream come true.

p.s. I saw Rasmus at the PHP Conference in Moscow on 25-26 May, and he just repeated to me that fixing the bug “will take some time”.