[Catalyst] FW: Clustering catalyst apps

Dave C skinnydill at gmail.com
Mon May 8 22:14:14 CEST 2006

On 5/8/06, Gert Burger <gburger at mweb.co.za> wrote:
> Thanks for the reply, here are some of my comments on this:

Disclaimer: I work for a large hosting company (shameless:
http://www.hostway.com) and I specialize in designing highly available
clusters for large customers using all Open Source, freely available
software running on both (depending on the customer) "crappy" and
non-crappy systems (we host parts of foxnews.com, orbitz, Wikipedia,
and others).

The key to offering "five nines" availability (99.999%, or a little over
5 minutes of downtime a year) is to examine faults in every aspect,
including application, hardware, network, facility, and OS, to identify
single points of failure.  Then, just design around them.  This goes down
to such details as plugging servers into different power strips on
separate phases (it may seem obvious, but you'd be surprised what I've
seen bring a cluster down) and using IP addresses located on different
subnets.
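
To put a number on "five nines": the downtime budget is just the
unavailable fraction multiplied by the length of a year.  A quick Python
sketch of that arithmetic (the function name is my own, and it assumes a
365-day year):

```python
# Minutes in a non-leap year: 365 days * 24 hours * 60 minutes = 525,600.
MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_budget_minutes(availability_percent: float) -> float:
    """Return how many minutes per year a service may be down
    while still meeting the given availability percentage."""
    unavailable_fraction = 1 - availability_percent / 100
    return MINUTES_PER_YEAR * unavailable_fraction


if __name__ == "__main__":
    # Compare three, four, and five nines side by side.
    for target in (99.9, 99.99, 99.999):
        minutes = downtime_budget_minutes(target)
        print(f"{target}% -> {minutes:.2f} minutes/year")
```

At 99.999% this works out to about 5.26 minutes a year, which is why every
single point of failure matters.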

On a larger scale, we happen to offer a global caching platform
similar to Akamai built on pure Open Source software which will route
around an entire data center going offline (we have ten different data

> Using round robin dns still means that if 50% of the servers are down,
> 50% of all queries will goto the broken machines. Which will piss of
> half your customers.

Not necessarily.  Both google.com and yahoo.com use RR DNS:

host www.google.com
www.google.com is an alias for www.l.google.com.
www.l.google.com has address
www.l.google.com has address
www.l.google.com has address

host www.yahoo.com
www.yahoo.com is an alias for www.yahoo.akadns.net.
www.yahoo.akadns.net has address
www.yahoo.akadns.net has address
www.yahoo.akadns.net has address
www.yahoo.akadns.net has address
www.yahoo.akadns.net has address
www.yahoo.akadns.net has address
www.yahoo.akadns.net has address
www.yahoo.akadns.net has address

However, they lower the TTL on the records to under 60 seconds, which
allows changes to take effect quickly.  Using monitoring software
like Nagios, monit, or your own checker built on
Test::WWW::Mechanize::Catalyst, one could connect to the application on
each address and, if there is an error, yank that IP from DNS.
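
As a rough illustration of the "check each address, drop the broken ones"
idea, here is a minimal Python sketch.  It is not what Nagios or monit do
internally; the function names, the plain HTTP GET check, and the 5 second
timeout are all my assumptions.  The actual DNS update is left out because
it depends entirely on which DNS server you run.

```python
import http.client
import urllib.error
import urllib.request


def is_healthy(ip, path="/", timeout=5.0):
    """Hypothetical probe: True if an HTTP GET to the server at `ip`
    completes without an error status or connection failure."""
    url = f"http://{ip}{path}"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Covers HTTP 4xx/5xx, refused connections, timeouts, bad replies.
        return False


def split_by_health(ips, probe=is_healthy):
    """Partition `ips` into (healthy, failed) lists using `probe`.

    The healthy list is what should stay in the round-robin record;
    the failed list is what should be pulled from DNS.
    """
    healthy, failed = [], []
    for ip in ips:
        (healthy if probe(ip) else failed).append(ip)
    return healthy, failed
```

Run from cron or a loop at an interval shorter than the TTL, something
like `healthy, failed = split_by_health(["192.0.2.10", "192.0.2.11"])`
(documentation addresses, not real servers) gives you the list to publish
and the list to remove.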

> Anycase, back to my issue, How do websites like slashdot and amazon, all
> which use perl, keep uptimes of close to 99.999% ?

They use multiple layers of redundancy.  As I outlined above, the
first layer would be RR DNS; then each of the IPs returned is
connected to some sort of load balancer (hardware, possibly using
BigIP, Foundry, or Cisco gear, or software, using LVS).  Behind that
there's some reverse proxying being done, connecting to query caches for
database-intensive work, before the response is returned to the client.
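
To make the load balancer layer a bit more concrete, here is a toy Python
sketch of round-robin selection that skips backends marked as down.  This
is only an illustration of the idea, not how LVS or any of the hardware
balancers are implemented; the class and method names are made up.

```python
class RoundRobinBalancer:
    """Toy round-robin balancer that skips backends marked as down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.down = set()
        self._index = 0  # position of the next backend to try

    def mark_down(self, backend):
        """Take a backend out of rotation (e.g. after a failed check)."""
        self.down.add(backend)

    def mark_up(self, backend):
        """Put a backend back into rotation."""
        self.down.discard(backend)

    def next_backend(self):
        """Return the next healthy backend in rotation order."""
        # Try each backend at most once per call.
        for _ in range(len(self.backends)):
            backend = self.backends[self._index]
            self._index = (self._index + 1) % len(self.backends)
            if backend not in self.down:
                return backend
        raise RuntimeError("no healthy backends available")
```

The point is that a dead backend costs the remaining ones extra load but
costs the client nothing, as long as at least one backend is still up.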

For a good outline of how LiveJournal uses open source software for
high availability, check

> And is it possible to get to that level with lots of crappy hardware?

Yes, Google actually designs around this.  They don't even use
hardware RAID in their systems and are said to use commodity equipment
costing roughly $1000/piece. 

