Squaring off with your high availability terminology

Following our previous post on the basics of high-availability services, it occurred to us that there’s often some confusion about the use of certain terms and phrases. We’d like to clear that up before pressing on, and hopefully reduce some of the headaches for people in the long run.

We’re dealing with a few closely related terms here, with important differences in meaning:

  • High availability
  • Load balancing
  • Linux HA

High Availability (HA) is a concept and a goal. How you achieve it is up to you, but the implication is that it involves more than one server, because a single server is a single point of failure.

Having a hot-standby server to take over in the event of a failure is one way to get HA. For certain types of services this is the most appropriate method, and Anchor uses the Linux HA software to do this.

Another option is to run a team of identical servers, all serving requests, with the intention that if some of them fail, the others will keep going. This is generally called load balancing, and is best suited to things like web frontends serving HTTP/HTTPS requests.

Load Balancing refers to having a pool of two or more servers serving requests for clients. Each server is identical as far as clients are concerned, so it doesn’t matter which server in the pool answers the request.

Load balancing is a popular HA technique because it’s also scalable: capacity can be increased roughly linearly by adding more servers to the pool. Failures are handled gracefully by dropping the failed server from the pool, leaving the remaining servers to carry the load.
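
To make the idea concrete, here's a minimal Python sketch of a round-robin pool that drops a failed server. It's a toy model under our own assumptions, not how any particular load balancer is implemented; the `Pool` class and the server names `web1` to `web3` are made up for illustration.

```python
from itertools import count


class Pool:
    """Toy round-robin pool; the server names used below are hypothetical."""

    def __init__(self, servers):
        self.servers = list(servers)
        self._counter = count()

    def mark_failed(self, server):
        # Drop a failed server so it stops receiving requests.
        if server in self.servers:
            self.servers.remove(server)

    def pick(self):
        # Hand out servers in turn; any server may answer any request.
        if not self.servers:
            raise RuntimeError("no healthy servers left in the pool")
        return self.servers[next(self._counter) % len(self.servers)]


pool = Pool(["web1", "web2", "web3"])
first_three = [pool.pick() for _ in range(3)]    # every server gets a turn

pool.mark_failed("web2")                         # web2 dies and is dropped
after_failure = [pool.pick() for _ in range(4)]  # only web1 and web3 remain
```

Adding capacity in this model is just a matter of putting another name in the list, which is where the roughly linear scaling comes from.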

Load balancing isn’t a free lunch though. The load balancer itself, which sits in front of the servers and distributes incoming requests evenly, is a critical single point of failure. Now you need HA for your load balancer, and you might also hit a performance wall at the load balancer if you see enough traffic.

It’s also important to keep in mind that a given client won’t always be served by the same server in the pool. This breaks some common assumptions about how things work (e.g. that session state kept on one server will still be there for the next request), so a degree of care is needed when using load balancing, and some services are difficult if not impossible to load balance (e.g. databases).
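
The session state problem is easy to demonstrate. The Python sketch below is purely illustrative: the `Server` class, the session ID and the plain dict standing in for a shared session store are all our own assumptions, not part of any real web framework.

```python
class Server:
    """A hypothetical web server that keeps sessions in a dict."""

    def __init__(self, name, shared_store=None):
        self.name = name
        # With no shared store, each server only sees its own sessions.
        self.sessions = shared_store if shared_store is not None else {}

    def login(self, session_id, user):
        self.sessions[session_id] = user

    def whoami(self, session_id):
        # Returns None when this server has never heard of the session.
        return self.sessions.get(session_id)


# Local state: the login lands on web1, the next request lands on web2.
web1, web2 = Server("web1"), Server("web2")
web1.login("abc123", "alice")
local_result = web2.whoami("abc123")      # web2 has no record of the login

# Shared state: both servers read and write the same store.
store = {}
web1s, web2s = Server("web1", store), Server("web2", store)
web1s.login("abc123", "alice")
shared_result = web2s.whoami("abc123")    # web2 finds the session
```

In practice the shared store would be something external to the web servers, and it then becomes one more service that needs to be highly available.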

The Linux HA project is a suite of software components used for building high availability systems, and is considered to be The HA solution on Linux. The primary components are Corosync (the cluster messaging layer) and Pacemaker (the cluster resource manager).

Linux HA manages “resources” on a cluster of servers. If a resource stops working or the server hosting a resource dies, it’s failed over to another server in the cluster to keep it running. Linux HA doesn’t perform any load balancing; there are other tools such as ldirectord for that purpose.
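
As a rough mental model of failover, here's a short Python sketch. It is emphatically not Pacemaker's actual placement logic, which weighs constraints and scores; the `Cluster` class, the node names and the resource names are hypothetical, and the "move everything to the first surviving node" rule is a simplification we've chosen for brevity.

```python
class Cluster:
    """Toy model of resource failover; not Pacemaker's real algorithm."""

    def __init__(self, nodes):
        self.online = list(nodes)
        self.placement = {}  # resource name -> node currently hosting it

    def start(self, resource, node):
        if node not in self.online:
            raise ValueError(f"{node} is not online")
        self.placement[resource] = node

    def node_failed(self, node):
        # Take the dead node out of the cluster.
        self.online.remove(node)
        if not self.online:
            raise RuntimeError("no nodes left to host resources")
        # Move every resource it was hosting to a surviving node.
        for resource, host in list(self.placement.items()):
            if host == node:
                self.placement[resource] = self.online[0]


cluster = Cluster(["node1", "node2"])
cluster.start("virtual-ip", "node1")
cluster.start("database", "node1")

cluster.node_failed("node1")    # node1 dies; its resources move to node2
placement_after = dict(cluster.placement)
```

Note that each resource only ever runs on one node at a time here, which is the key difference from a load-balanced pool.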

Corosync and Pacemaker are powerful tools, but also very complex. The learning curve to get started is steep, and maintenance requires a fine hand to avoid shooting yourself in the foot. Properly implemented, a Linux HA cluster is very reliable and delivers excellent uptime. If that’s what you’re after, why not employ experts, like yours truly here at Anchor? 😉

That’s it for this instalment. Feel free to leave a comment if anything isn’t clear or needs elaboration.

Next time we’ll talk about different options for deploying Linux HA clusters, and what’s suitable in various situations.