Lighttpd + PHP fastcgi woes

In trying to get more out of our webservers using a Lighttpd and PHP-FastCGI setup, I've come across some major issues that make it difficult to use. I hope this post will warn people of some of the bugs they might encounter and workaround that might need to be implemented until some of these are fixed.

First off, the parent PHP-CGI process spawns n number of children, depending on your PHP_FCGI_CHILDREN. However, if your webserver (lighttpd) is stopped, or restarted, the parent process does not kill its children and they all get orphaned.

The only way to get around this easily is by making sure that as soon as you need to stop or restart your webserver, you do a 'killall php-cgi' while the server is still down. There's a PHP bugreport open, which seems to indicate the issue also happens in Apache. Vote for it!

The second, more severe issue is that when you hit maximum capacity for your PHP backend, lighty will start serving HTTP 500 errors for all PHP requests, and does not seem to stop doing this until the webserver is restarted altogether. Although not completely sure, these bugs seem to be the relevant ones:

So yea, based on this information it turns out that there's a clear need for some smart process killing/webserver restarting scripts if you'd like to switch to lighttpd in a high load environment. I got pretty scared trying this after finding these bugs. Makes me think no one really tried it out under heavy loads, which leads me to some questions.. Hopefully some readers of this feed have some experience here.

  1. Are you using lighttpd or an other alternative 'light' webserver using PHP under high load environments? Have you experienced similar issues?
  2. What are good ways around PHP's FCGI buggy behaviour (the buggy part is that PHP's parent process should return FCGI_OVERLOADED instead of timing out.) Should FCGI be avoided altogether at this point?
  3. What is a good way to come up with settings for 'max-procs' and 'PHP_FCGI_CHILDREN'. Reading other people's comment on this on the web people are all across the board, ranging from 1 for max-procs and 200 for PHP_FCGI_CHILDREN, to the exact opposite. Supposedly APC is isolated to 1 group of processes, so getting at least bigger groups of processes is important.
  4. And most importantly, whats your moms favourite color?