Apache speed and reverse proxies

In our environment we use Apache everywhere. Its PHP integration has so far proven superior. Now we're dealing with higher loads, and we've hit some limitations.

One of the problems we ran into is Apache's heaviness. Our apache2 worker processes each eat up around 20 MB of memory, so with 3 GB of memory we end up with a MaxClients setting of around 150. Rasmus seems to think that's a pretty high setting, but based on the simple calculation (memory available for Apache / size of an Apache process, i.e. 3072 MB / 20 MB ≈ 150) it works out for us.
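For reference, this is roughly what that looks like in an Apache 2.2 prefork configuration. The surrounding values are just illustrative; only the MaxClients number follows from the calculation above.

    <IfModule mpm_prefork_module>
        StartServers          10
        MinSpareServers       10
        MaxSpareServers       20
        # 3072 MB available / ~20 MB per apache2 process ≈ 150
        MaxClients           150
        MaxRequestsPerChild 1000
    </IfModule>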

Effectively this means we can serve approximately that many parallel requests on this machine. It is therefore in our best interest to get every response out as quickly as possible, increasing the number of requests we can handle per second.
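As a rough illustration, with made-up response times (ours vary wildly per request):

    150 workers / 0.50 s average response time ≈ 300 requests per second
    150 workers / 0.25 s average response time ≈ 600 requests per second

Halving the time each worker is occupied per request doubles the theoretical throughput, without touching MaxClients.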

Going beyond this number of 150 could cause Linux to start swapping. This is bad, because swapping adds latency to the response, which in turn results in connections staying open longer.

Since we're sending everything over the web, there is a baseline latency. Information traveling to the other side of the globe will take at least 67 ms, because we're restricted by the speed of light: half the Earth's circumference is roughly 20,000 km, and 20,000 km / 300,000 km/s ≈ 67 ms. This doesn't take non-direct routes or other hardware latency into account. As Till points out, all of this adds to the time a single Apache process is tied up before it can work on the next request.

The reverse proxy

There are a couple of webservers that seem to be optimized for serving lots of clients. Lighttpd got a lot of traction earlier, but the project seems to have slowed down a lot, as the much-anticipated 1.5 release has been under development for almost two years. nginx seems to have taken its place in terms of disruptiveness. These servers are much more lightweight, and are supposed to be faster at delivering static files.

Much like Till, we've had issues hooking PHP directly into these servers. Till suggests the solution of placing nginx in front of Apache (on the same machine) as a reverse proxy. nginx takes care of serving static files and proxies any PHP request to Apache. The concept is that Apache can push the response out to nginx as quickly as possible, and while nginx is busy delivering it to the (slow) client, Apache can take on other work.
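A minimal sketch of what that nginx configuration could look like, assuming Apache has been moved to port 8080 on the same machine (the document root and hostname are made up):

    server {
        listen      80;
        server_name example.com;

        # nginx serves static files directly
        location / {
            root /var/www/example.com;
        }

        # anything PHP is proxied to Apache on the same box
        location ~ \.php$ {
            proxy_pass       http://127.0.0.1:8080;
            proxy_set_header Host            $host;
            proxy_set_header X-Real-IP       $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }

Because nginx buffers upstream responses by default, Apache is released as soon as it has handed the response over, which is exactly the effect we're after.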

The thing that bothers me with this setup is the need for two webserver products to achieve a single task. This implies that neither of them is adequate on its own to do the job.

On the other hand, this type of setup is also what a lot of people seem to be doing by placing Squid in front of their webservers, although that tends to happen on separate hardware.

HTTP/1.1 100 Continue

All of a sudden we noticed that a problem we saw earlier with Lighttpd (Bug #1017) was also an issue in nginx (we couldn't find a bug report, or a bug tracker at all). Neither of them seems to support the Expect: 100-continue header. While no browser actually sends this header, we have webservices running which are directly accessed by other types of HTTP clients. Losing support for this piece of HTTP functionality would instantly break their applications, which is unacceptable.
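For those unfamiliar with it, this is roughly what the exchange looks like on the wire (a simplified illustration with a made-up endpoint; > is the client, < is the server):

    > POST /service HTTP/1.1
    > Host: api.example.com
    > Content-Type: text/xml
    > Content-Length: 54321
    > Expect: 100-continue
    >
    < HTTP/1.1 100 Continue
    >
    > (the client only now sends the 54321-byte request body)
    >
    < HTTP/1.1 200 OK

The client waits for the interim 100 Continue response before transmitting a potentially large body. A proxy that doesn't understand the Expect header leaves such clients waiting, or rejects the request outright.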

So now we're actually looking at Squid to perform that task. Squid is powerful and well tested. We're going to start load testing this reasonably soon, and I have no problem reporting back here if people are interested in numbers. I'm wondering if there are other people who have tried a similar setup, or if there are better ways to approach this problem.
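In case anyone wants to experiment along, a minimal accelerator (reverse proxy) setup in Squid 2.6-style syntax might look something like this. It's a sketch only, with example.com standing in for our real site, not the configuration we'll actually be load testing:

    # Squid listens on port 80 as the accelerator
    http_port 80 accel defaultsite=example.com

    # Apache, moved to port 8080 on the same machine, acts as the origin server
    cache_peer 127.0.0.1 parent 8080 0 no-query originserver name=apache

    # only accelerate requests for our own site
    acl our_site dstdomain example.com
    http_access allow our_site
    http_access deny all
    cache_peer_access apache allow our_site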