[Catalyst] Scalable Catalyst
Tomas Doran
bobtfish at bobtfish.net
Thu Apr 30 12:31:27 GMT 2009
Alejandro Imass wrote:
> Anyway, the message is that with mod_worker/mod_perl you can spawn
> _thousands_ of threads, getting impressive concurrency (without
> counting the mutex). We have tested Catalyst applications that handle
> _thousands_ of concurrent requests using off-the-shelf AMD 64-bit HW
> and 12GB RAM, with a Catalyst app of about 20MB RSS.
There is a big difference between having thousands of requests in-flight
at once, and serving thousands of new requests a second.
You're saying that mod_worker can do the former well, without mentioning
the latter.
My guess is that in your configuration, most of your workers (and
requests) are just pushing bytes to the user, which isn't really a
hard job. :)
The reason that normal mod_perl fails at this is that you have one
process per request, so having many, many requests in flight at once hurts.
However, if you have thousands of requests all trying to generate pages
at once, you're going to fail and die - full stop...
perl -e'system("perl -e\"while (1) {}\" \&") for (1..1000)'
will convince you of this if you aren't convinced already :) - it
backgrounds 1000 busy-looping perl processes, which is roughly what
thousands of simultaneous page generations look like to the scheduler.
You can trivially get round this by having a _small_ number of mod_perl
processes behind a proxy: your (expensive/large) mod_perl process
generates a page, then throws it at network speed (1Gb/s or higher if
you're on localhost) to the proxy, which then streams it to the user
much more slowly. This frees up your mod_perl processes as quickly as
possible to get on with useful work.
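As a minimal sketch of the frontend half of that setup - assuming nginx
as the buffering proxy, with the backend address purely illustrative:

    # nginx frontend: buffers the backend's response and spoon-feeds it
    # to slow clients, freeing the heavy backend worker as soon as the
    # page has been generated
    server {
        listen 80;
        location / {
            proxy_pass http://127.0.0.1:8080;  # small mod_perl pool
            proxy_buffering on;                # nginx's default; shown for clarity
        }
    }

Any buffering reverse proxy (perlbal, Apache's mod_proxy, etc.) fills the
same role; the point is that the slow byte-pushing happens in a cheap
frontend process rather than in a heavyweight application worker.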
I'd also note that having more threads/processes generating pages than
you have CPU cores is fairly inefficient: the more processes you have,
the greater the penalty you incur from context-switching overhead.
In most apps you quite often block on the database, which means that one
process per CPU core doesn't hold totally true for best throughput, so
YMMV.
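As a very rough rule of thumb (my generalisation, not a measured figure
from any particular app): if each request spends a fraction b of its
time blocked on the database or other I/O, you want roughly
cores / (1 - b) workers to keep the CPUs busy, e.g.:

    2 cores, requests blocked on the DB 50% of the time:
    2 / (1 - 0.5) = 4 worker processes to saturate both cores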
For the record, one of my apps can trivially do 200 requests a second,
with 3000+ concurrent requests in-flight, using a single 4GB dual-core
x64 box with one disk, running both the application _and_ the MySQL server.
It flattens the 100Mb/s pipe to the internet I have in the office waaay
before the system actually starts to struggle from a load perspective.
That's nginx / FastCGI with 3 FCGI worker processes (number of cores + 1) -
when benchmarking, I found this the most efficient for that application.
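For reference, that sort of deployment looks roughly like this (a
sketch - the app name, socket path, and worker count are illustrative,
using the stock Catalyst FastCGI runner):

    # start 3 Catalyst FastCGI workers on a unix socket
    script/myapp_fastcgi.pl --listen /tmp/myapp.socket --nproc 3

    # nginx side: pass requests straight to the FCGI pool
    location / {
        include fastcgi_params;
        fastcgi_pass unix:/tmp/myapp.socket;
    }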
This is one of the things where your mileage varies significantly
depending on what your application is doing, and anyone else's answer is
going to be lies - you _need_ to test and optimise it yourself for your
app and your workload. :)
Cheers
t0m