Networks can now be made so redundant - at reasonable prices - that there is no excuse for a data center not to achieve five nines (99.999%) of availability.
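To put "five nines" in perspective, here is a small sketch that converts an availability percentage into the downtime it allows per year. The function name and the choice of a 365.25-day year are my own assumptions for illustration, not an industry-mandated definition.

```python
# Average year length in minutes (365.25 days is an assumption for illustration).
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime, in minutes, allowed at the given availability."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * MINUTES_PER_YEAR


for label, pct in [("three nines", 99.9), ("four nines", 99.99), ("five nines", 99.999)]:
    print(f"{label} ({pct}%): {downtime_minutes_per_year(pct):.2f} minutes/year")
```

Under these assumptions, five nines leaves room for only about 5.26 minutes of downtime per year.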
But what about the servers and applications? Why spend so much time up front configuring the network to make sure it doesn't fail, and then deploy an application to a single server?
Sure, there are ways to give individual servers some redundancy to minimize failures -- things like RAID1, RAID5, or RAID10 (redundant array of inexpensive disks), which protect against a disk drive failure. (I highly recommend this type of configuration for all production servers, preferably using hardware RAID rather than software RAID.) But what happens if a file on the RAID array gets corrupted? Or a recent configuration change brings the application down? Or a newly released patch conflicts with other settings and causes problems? Well, in these situations the server will go down and the application(s) hosted on that server will be offline.
A good monitoring and alerting process will allow the system administrator to detect and address these problems quickly, but there will still be some downtime. And depending on the type of problem, even the best system administrator might not be able to resolve it immediately - it may take time. That is time during which your application is unavailable and you may be losing business due to the site interruption.
So, what can you do?
A great option - and one that has recently become more affordable - is to host your application on a webfarm. A webfarm consists of two or more web servers that share the same configuration and serve up the same content. Special switches and processes allow each of these servers to respond to requests sent to a single address. For example, say we have two servers - svr1.orcsweb.com and svr2.orcsweb.com - that have exactly the same configuration and content. We could configure a special switch* to handle traffic that is sent to www.orcsweb.com and route it to either of these nodes depending on some routing logic. Clients visiting the main URL (in this case www.orcsweb.com) have no idea whether this is a single server - or ten servers! The balancing between nodes is seamless and transparent.
[*note: There is also software that could handle the routing process, but experience and testing have shown that these types of solutions are generally not as scalable, fast, or efficient as the hardware switch solutions.]
The routing logic can take a number of forms. The most common are:
- Round-robin: Each node gets a request sent to it "in turn". So, node1 gets a request, then node2, then node1 again, then node2 again.
- Least Active: Whichever node has the lowest number of current connections gets new connections sent to it. This helps keep the load balanced between the server nodes.
- Fastest Reply: Whichever node replies faster is the one that gets new requests. This is also a good option - especially if there are nodes that might not be "equal" in performance. If one performs much better than the other, why not send more requests there?
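The three routing options above can be sketched in a few lines. This is only an illustration of the selection logic, not how any particular switch implements it; the `Node` class, its fields, and the sample connection counts and reply times are all made up for the example.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Node:
    """Hypothetical view of a webfarm node as the switch might track it."""
    name: str
    active_connections: int = 0
    last_reply_ms: float = 0.0


def round_robin(nodes):
    """Return an endless iterator that hands out nodes in turn."""
    return cycle(nodes)


def least_active(nodes):
    """Pick the node with the fewest current connections."""
    return min(nodes, key=lambda n: n.active_connections)


def fastest_reply(nodes):
    """Pick the node with the lowest most-recent reply time."""
    return min(nodes, key=lambda n: n.last_reply_ms)


# Sample values are invented for illustration.
nodes = [
    Node("svr1.orcsweb.com", active_connections=12, last_reply_ms=40.0),
    Node("svr2.orcsweb.com", active_connections=7, last_reply_ms=55.0),
]

turns = round_robin(nodes)
print([next(turns).name for _ in range(4)])  # svr1, svr2, svr1, svr2
print(least_active(nodes).name)              # svr2.orcsweb.com
print(fastest_reply(nodes).name)             # svr1.orcsweb.com
```

Note that with these sample numbers Least Active and Fastest Reply choose different nodes, which is why the choice of routing logic matters when nodes are not equal.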
In any of these scenarios the switch will also detect when a node fails. So, if svr1.orcsweb.com were taken offline for maintenance - or had a critical failure - the switch would detect that and send traffic only to svr2.orcsweb.com. And since the clients always access the site via the main URL (not the node names), they have no idea that one of the nodes is down - the application continues to serve client requests seamlessly.
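Here is a minimal sketch of that failover behavior, assuming round-robin routing. The `Balancer` class and its `mark_down`/`mark_up` methods are hypothetical names for this example; a real switch would decide node health from its own periodic checks rather than being told.

```python
class Balancer:
    """Toy round-robin balancer that skips nodes marked as down."""

    def __init__(self, node_names):
        self.node_names = list(node_names)
        self.healthy = set(self.node_names)
        self._next = 0  # index of the node whose turn is next

    def mark_down(self, name):
        """Stop routing to a node (failure or planned maintenance)."""
        self.healthy.discard(name)

    def mark_up(self, name):
        """Resume routing to a known node once it is healthy again."""
        if name in self.node_names:
            self.healthy.add(name)

    def pick(self):
        """Return the next healthy node in turn, skipping any that are down."""
        for _ in range(len(self.node_names)):
            candidate = self.node_names[self._next]
            self._next = (self._next + 1) % len(self.node_names)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy nodes available")


lb = Balancer(["svr1.orcsweb.com", "svr2.orcsweb.com"])
print([lb.pick() for _ in range(2)])  # both nodes take turns

lb.mark_down("svr1.orcsweb.com")      # svr1 goes offline
print([lb.pick() for _ in range(3)])  # only svr2 receives traffic
```

The caller never names a node; it only asks the balancer for "the next one", which is the same reason clients of www.orcsweb.com never notice a node going down.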
Besides high availability (continuing to satisfy requests during a failure), a webfarm also gives an application a higher level of scalability - the ability to handle more and more load. If load on the application increased to the point where performance started to degrade, more nodes could be added to the webfarm (again, without clients noticing), giving the ability to handle potentially unlimited levels of traffic (just keep adding nodes!).
Of course there are a lot of factors surrounding the proper support of a webfarm - the switches, failover between switches (don't let the switch be a single point of failure!), replication of content, synchronization of server changes, synchronization of application changes, etc. But a good system administrator (or experienced hosting company) can help address all of these issues for you.
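As a rough way to think about "how many nodes?", the sketch below sizes a webfarm for a given peak load and then adds spare capacity so that losing a node does not push the remaining ones past their limit. The function name, the one-spare default, and the request rates are all assumptions for illustration; real per-node capacity has to be measured for your own application.

```python
import math


def nodes_needed(peak_requests_per_sec, per_node_requests_per_sec, spare_nodes=1):
    """Nodes required to carry the peak load, plus spares to survive failures."""
    base = math.ceil(peak_requests_per_sec / per_node_requests_per_sec)
    return base + spare_nodes


# Hypothetical numbers: 900 requests/sec at peak, 400 requests/sec per node.
print(nodes_needed(900, 400))  # 3 nodes for the load + 1 spare = 4
```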
Hopefully this has been a good introduction to webfarms for you, and hopefully I've properly communicated enough of the benefits for you to consider this as a hosting option for yourself. With the rates now down to affordable levels - why not get this additional layer of protection?
Happy hosting!