Lighttpd + PHP fastcgi woes
In trying to get more out of our webservers using a Lighttpd and PHP-FastCGI setup, I've come across some major issues that make it difficult to use. I hope this post will warn people of some of the bugs they might encounter and the workarounds that might need to be implemented until some of these are fixed.
First off, the parent PHP-CGI process spawns n number of children, depending on your PHP_FCGI_CHILDREN. However, if your webserver (lighttpd) is stopped, or restarted, the parent process does not kill its children and they all get orphaned.
The only easy way around this is to make sure that whenever you stop or restart your webserver, you do a 'killall php-cgi' while the server is still down. There's a PHP bug report open, which seems to indicate the issue also happens in Apache. Vote for it!
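A less blunt alternative to killing every php-cgi on the box is to kill only the ones that no longer descend from a running lighttpd. The sketch below shows just that selection step on a hand-written process table. The snapshot, the pids and the helper names are all made up for illustration, and a real script would still have to read the table from the OS and send the signals.

```python
from typing import Dict, List, Tuple

# A process table maps pid -> (parent pid, command name).
ProcTable = Dict[int, Tuple[int, str]]


def has_ancestor(table: ProcTable, pid: int, name: str) -> bool:
    """Return True if any ancestor of `pid` runs the command `name`."""
    seen = set()
    ppid = table[pid][0]
    while ppid in table and ppid not in seen:
        seen.add(ppid)
        if table[ppid][1] == name:
            return True
        ppid = table[ppid][0]
    return False


def orphaned_php(table: ProcTable, server: str = "lighttpd",
                 worker: str = "php-cgi") -> List[int]:
    """List worker pids that no longer descend from the web server."""
    return sorted(
        pid for pid, (_, cmd) in table.items()
        if cmd == worker and not has_ancestor(table, pid, server)
    )


# Hypothetical snapshot: lighttpd (pid 100) owns one php-cgi group,
# while an older group (pid 200 and its children) was reparented to init.
snapshot: ProcTable = {
    1: (0, "init"),
    100: (1, "lighttpd"),
    110: (100, "php-cgi"),
    111: (110, "php-cgi"),
    200: (1, "php-cgi"),
    201: (200, "php-cgi"),
    202: (200, "php-cgi"),
}

print(orphaned_php(snapshot))  # [200, 201, 202]
```

The group under pid 110 is left alone because it still hangs off lighttpd; only the stale group is reported.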
The second, more severe issue is that when you hit maximum capacity for your PHP backend, lighty will start serving HTTP 500 errors for all PHP requests, and does not seem to stop doing this until the webserver is restarted altogether. Although not completely sure, these bugs seem to be the relevant ones:
So yea, based on this information it turns out that there's a clear need for some smart process killing/webserver restarting scripts if you'd like to switch to lighttpd in a high load environment. I got pretty scared trying this after finding these bugs. Makes me think no one really tried it out under heavy loads, which leads me to some questions.. Hopefully some readers of this feed have some experience here.
- Are you using lighttpd or another 'light' webserver with PHP in high load environments? Have you experienced similar issues?
- What are good ways around PHP's buggy FCGI behaviour? (The buggy part is that PHP's parent process should return FCGI_OVERLOADED instead of timing out.) Should FCGI be avoided altogether at this point?
- What is a good way to come up with settings for 'max-procs' and 'PHP_FCGI_CHILDREN'? Reading other people's comments on the web, people are all across the board, ranging from 1 for max-procs and 200 for PHP_FCGI_CHILDREN, to the exact opposite. Supposedly APC is isolated to 1 group of processes, so getting at least bigger groups of processes is important.
- And most importantly, what's your mom's favourite color?
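For anyone wondering what "returning FCGI_OVERLOADED" would look like on the wire: in the FastCGI protocol it is a protocolStatus value carried in an END_REQUEST record. The sketch below only builds such a record from the constants in the FastCGI specification. It is not PHP's code, and the function name is mine.

```python
import struct

FCGI_VERSION_1 = 1
FCGI_END_REQUEST = 3   # record type
FCGI_OVERLOADED = 2    # protocolStatus: application is out of some resource


def end_request_record(request_id: int, app_status: int,
                       protocol_status: int) -> bytes:
    """Build a FastCGI END_REQUEST record (8-byte header + 8-byte body)."""
    # Body: 4-byte appStatus, 1-byte protocolStatus, 3 reserved bytes.
    body = struct.pack("!IB3x", app_status, protocol_status)
    # Header: version, type, requestId, contentLength, paddingLength, reserved.
    header = struct.pack("!BBHHBx", FCGI_VERSION_1, FCGI_END_REQUEST,
                         request_id, len(body), 0)
    return header + body


record = end_request_record(request_id=1, app_status=0,
                            protocol_status=FCGI_OVERLOADED)
print(record.hex())  # 01030001000800000000000002000000
```

A backend that answers with this record immediately tells the webserver it is full, instead of leaving the request to hang until a timeout.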
Comments
kumi •
For the orphaned PHP procs problem, maybe you should try setting "kill-signal" => 10, as someone mentioned in the lighttpd bug tracker. However, I haven't tried this setting and my webserver doesn't have any problem with orphaned PHP processes.
We average 60 req/s, with short-lived high load events every day (our MySQL backend is responsible). During these times, server-status shows a very high load on all of our backends and lighttpd returns 500 errors, but when the load event clears, everything is back to normal.
I used to have "max-procs" => 1, but found that the 500 errors were occurring too frequently. I have increased that value to 4, and the situation seems to be better now. We use xcache and there is no problem with multiple PHP parents (only an increase in memory usage.)
Chris •
Try using XCache. It was the final change that allowed us to attain stability with PHP + FastCGI + Lighttpd. We previously ran APC.
I too have played a lot with the children/max-procs setting. The docs on Lighttpd suggest using 1 if you are using an opcode cache. I find that this isn't good advice: it is less reliable, because once the one back-end you have gets overloaded, it's game over and you need to stop/kill all the cgi processes and restart the server. We seem to be ok running 4 with XCache.
Here is our current config, we serve 200k unique visitors a day, our average page generation time is around .03.
fastcgi.server = ( ".php" =>
  (( "socket" => "/tmp/php-fastcgi.socket" + var.PID,
     "bin-path" => "/opt/php5/bin/php-cgi",
     "idle-timeout" => 20,
     "max-procs" => 4,
     "bin-environment" => (
       "PHP_FCGI_CHILDREN" => "8",
       "PHP_FCGI_MAX_REQUESTS" => "500"
     )
  ))
)
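To make the trade-off between the two settings concrete, here is a small sketch of the arithmetic. It assumes the usual reading of these options: lighttpd starts max-procs PHP parents, each parent forks PHP_FCGI_CHILDREN workers, and each parent group has its own opcode cache. The function name is just for illustration.

```python
def php_process_counts(max_procs: int, children: int) -> dict:
    """Count the processes implied by max-procs and PHP_FCGI_CHILDREN."""
    return {
        "parents": max_procs,                  # one per max-procs slot
        "workers": max_procs * children,       # processes that serve requests
        "total": max_procs * (children + 1),   # workers plus their parents
        "opcode_caches": max_procs,            # assumed: one cache per group
    }


# The config above (4 x 8) next to the "1 x 200" extreme from the post.
for max_procs, children in ((4, 8), (1, 200)):
    print(max_procs, children, php_process_counts(max_procs, children))
```

With 4 and 8 that comes to 32 workers in 36 processes and four separate caches; 1 and 200 gives 200 workers sharing a single cache, but also a single backend to lose.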
Matthijs •
We have hit these FastCGI problems in Lighttpd too often and too hard. Even with XCache.
We tried to support the Lighttpd community to fix these problems, but there is so little going on that we're now switching to Nginx. Without any regrets, I must say.
I can also recommend looking into PHP-FPM to replace Lighty's spawn-fcgi.
Matthijs •
Oh and I forgot to say that we serve more than 90M pageviews / 5M uniques per day. ;-)
If you'd like to know more, drop me a line at my e-mail address.
Evert •
Hey Matthijs,
I'm definitely going to check out Nginx. It's not the first time I've heard discontent regarding the Lighttpd team's responsiveness and slow release schedules; perhaps Nginx..
I'd like to check out PHP-FPM, but it seems like all their documentation is in Russian.. Makes me a bit more unsure if it's a good choice or what I would use it for :)
Thanks for all the hints guys, glad it's not just me :)
Evert
Chris •
I have the same feeling regarding Lighttpd development. It was looking very promising for a while; now they are lost and going in three different directions: there is little support for 1.4, 1.5 is dead and who knows when 2 will be out.
If I were to set up a new webserver, I would definitely give Nginx a try. I would switch us now, but we have achieved stability/performance with Lighttpd for now.
Jani •
I haven't had any problems with PHP FastCGI + APC + lighttpd 1.4.19 with this config:
# Event handler
server.event-handler = "linux-sysepoll"
fastcgi.server = (
  ".php" => ((
    "bin-path" => "/opt/php/bin/php-cgi",
    "socket" => "/tmp/php.socket",
    "max-procs" => 1,
    "bin-environment" => (
      "PHP_FCGI_CHILDREN" => "32",
      "PHP_FCGI_MAX_REQUESTS" => "10000"
    ),
    "bin-copy-environment" => (
      "PATH", "SHELL", "USER"
    ),
    "broken-scriptfilename" => "enable"
  ))
)
I also use mod_magnet quite a lot, which makes my life very easy: no messy rewrite stuff. :)
But the server isn't really in very heavy use (yet) and hosts 2 websites.
This might change next week though...hopefully I won't see any of the problems described here. :D
Lukas •
All the PHP bug tickets that you linked to are closed with "no feedback". So how about providing some feedback?
Evert •
I definitely will as soon as I have time to do this. I put an investment in that setup, so I'd like to make it work..
My experiments were with stable Debian packages, so the intention of this post was really to highlight issues with similar setups, which I know there are quite a few of.
I should be clearer about my setup and which versions of the tools I use.
My Setup:
* Debian 4.0 Etch (updated regularly)
* PHP 5.2.0-8+etch11
* lighttpd-1.4.13
till •
I've had those issues too. In most cases lighttpd and PHP just lost track of each other and I had to restart both in order to make it work. Lighttpd crashed rarely, and neither did the fcgi processes - they just could not communicate.
I'm not an expert when it comes to lighttpd, but a lot of people claim that the problems are SMP related. And we tried it on an SMP system (FreeBSD 5.x, 6.x).
Also, define 'high traffic'. From a short intermezzo with lighttpd we went back to Apache 1.3 and mod_php. We can handle over 1000 requests per server. The O'Reilly book on Apache 1.3 has served me pretty well over the years.
Evert •
The 500 internal errors were during moments we actually had database load issues, which caused the webserver requests to queue up and never get out of 'overloaded state'.
So we're really talking about maxing out the concurrent PHP processes, which is currently set to 200.
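A rough way to see why a slow database eats all 200 processes: the number of busy workers is roughly the request rate times the time each request takes (Little's law). The sketch below uses made-up numbers for the rate and the response times; only the limit of 200 comes from my setup.

```python
def workers_needed(requests_per_second: float,
                   seconds_per_request: float) -> float:
    """Little's law: average busy workers = arrival rate x time in system."""
    return requests_per_second * seconds_per_request


LIMIT = 200  # concurrent PHP processes, as configured here

# Hypothetical load: 60 req/s, first with fast pages, then with a stalled DB.
for label, latency in (("normal", 0.03), ("database stalled", 5.0)):
    busy = workers_needed(60, latency)
    state = "saturated" if busy > LIMIT else "ok"
    print(f"{label}: about {busy:.1f} busy workers ({state})")
```

At the same request rate, pages that take seconds instead of milliseconds need hundreds of workers, so the pool fills up and every further request has to wait or fail.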
till •
At 200, that's a beating. :(
I can share my Apache 1.3 configs if you need, ping me via email.
Jani •
We peaked at 67 req/s today. No problems. I had changed the above config to this earlier though:
"max-procs" => 2,
"bin-environment" => (
  "PHP_FCGI_CHILDREN" => "16",
  "PHP_FCGI_MAX_REQUESTS" => "1000"
),
The one with only one parent process and 32 children worked too, but due to some other issues it ended up with some of those children overloaded.
Hint: APC + __autoload() == EVIL! :)
Evert •
67/s is pretty good.. I've yet to get that here.. I'm pretty shocked about your low number of processes.. Only 32!
Jani •
Just have to add: I noticed that most of the requests are handled by the "2nd" parent process' children:
fastcgi.active-requests: 0
fastcgi.backend.0.0.connected: 41093
fastcgi.backend.0.0.died: 0
fastcgi.backend.0.0.disabled: 0
fastcgi.backend.0.0.load: 0
fastcgi.backend.0.0.overloaded: 0
fastcgi.backend.0.1.connected: 131654
fastcgi.backend.0.1.died: 0
fastcgi.backend.0.1.disabled: 0
fastcgi.backend.0.1.load: 0
fastcgi.backend.0.1.overloaded: 0
fastcgi.backend.0.load: 1
fastcgi.requests: 172747
So in theory this server can actually handle a lot more req/s than the peak of 67/s (average was around 40 req/s).
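If you want to track that split over time, the counters are easy to parse. A small sketch, fed with the "connected" lines from the output above; the function names are made up and the text would normally come from the server's statistics page.

```python
def parse_counters(text: str) -> dict:
    """Turn 'name: value' lines into a dict of integer counters."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            counters[key.strip()] = int(value)
    return counters


def connection_share(counters: dict) -> dict:
    """Fraction of all backend connections handled by each backend."""
    connected = {k: v for k, v in counters.items()
                 if k.endswith(".connected")}
    total = sum(connected.values())
    return {k: v / total for k, v in connected.items()} if total else {}


sample = """fastcgi.backend.0.0.connected: 41093
fastcgi.backend.0.1.connected: 131654
fastcgi.requests: 172747"""

for name, share in connection_share(parse_counters(sample)).items():
    print(f"{name}: {share:.1%}")
```

For the numbers above this works out to roughly 24% for the first parent and 76% for the second.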
Rubén Ortiz •
Hi
I have had the same issues with lighttpd and PHP FastCGI. Currently, I only use lighttpd as a static file webserver; everybody knows it has lower resource consumption than Apache, so in this case the feeling is good.
But when I try to start a new environment with PHP FastCGI, I only have problems. First I tried on the Windows platform. It was hell. Then I tried on Linux, and although the result was better, I can't use lighttpd as a solid and stable dynamic content webserver yet.
I read some interesting advice here, I'm going to try it. Great post.
Bye
Jani •
Update: Got a peak record last week, over 100 req/s. And no noticeable effects. No crashes. I had earlier tuned MySQL a bit: I had thread_cache = 0 (which is default) and had to tune that a bit, but otherwise the same config as earlier. Proves my assumption it can take a beating easily. :)
Fred Peeterman •
I have noticed a few issues. First of all, the PHP FastCGI backend tends to run into problems when PHP_FCGI_MAX_REQUESTS is set too high. From the various postings I gather 1000 seems a good value.
Then, upon restart and reload, the FastCGI children become detached. It seems that lighttpd terminates before the FastCGI processes do. Upon start, lighttpd finds the still-running FastCGI processes and loses control over them. To overcome the problem I added sleep 0.5 at the end of the stop routine in my start-stop script. I have not found a cure for the reload scenario though. Suggestions are welcome ;-) For now I've changed logrotate to do a restart instead.
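A fixed sleep is a guess at how long the PHP processes need to exit. A variation on the same idea is to poll until they are actually gone, with an upper bound. The sketch below shows only the polling loop; the `still_running` check is a placeholder you would have to supply (for instance a wrapper around a process lookup), and the names are hypothetical.

```python
import time
from typing import Callable


def wait_until_gone(still_running: Callable[[], bool],
                    timeout: float = 5.0, interval: float = 0.1,
                    sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll until still_running() is False; return False on timeout."""
    waited = 0.0
    while still_running():
        if waited >= timeout:
            return False  # give up: processes are still around
        sleep(interval)
        waited += interval
    return True


# Simulated check: pretend the workers need three polls to exit.
polls = iter([True, True, True, False])
print(wait_until_gone(lambda: next(polls), sleep=lambda s: None))  # True
```

When the function returns False, the stop routine could fall back to killing the leftover processes before it lets the start routine run.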
Also may I suggest the vim-syntax-lighttpd package from the friendly folks from pld-linux.
Evert •
Having a high number for max_requests like 1000 could kill your server under high load scenarios.
If you run out of memory, your server might use swap, which could trigger some nasty chain reactions.
Evert •
Disregard my last comment, I confused MAX_REQUESTS with CHILDREN.
Thomas •
Hello, I'm facing a problem where my webserver process dies when I enable PHP within lighttpd using the configuration below.
fastcgi.server = ( ".php" =>
  ( "localhost" =>
    (
      # "socket" => "/var/run/lighttpd/php-fastcgi.socket",
      "socket" => "/tmp/php-fastcgi.socket",
      "bin-path" => "/usr/bin/php-cgi"
    )
  )
)
The process goes dead and subsys gets locked.
Please help.