The concept of load balancing is that tasks or requests are distributed across multiple computers. For example, when a client makes a standard HTTP request to a web application, the request can be directed to any of several web servers. The application's workload is thus spread over multiple machines, making it more scalable. Load balancing also provides redundancy: if one server in a cluster fails, the load balancer distributes the load among the remaining servers. When a request is moved from a failing server to a functional one, this is called "failover".
A cluster is typically a set of servers running the same application, and its purpose is twofold: to distribute the load across different servers, and to provide a redundancy/failover mechanism.
What are the common load balancing schemes?
Even Task Distribution Scheme
Also called "Round Robin", this scheme distributes tasks evenly between the servers in a cluster: each incoming task goes to the next server in turn. It works well when all servers have the same capacity and all tasks need the same amount of effort to process. The issue is that this methodology does not account for tasks requiring different amounts of effort. Say each of three servers is given 3 tasks. On the face of it the distribution seems equal; however, the tasks sent to Server 1 may need more effort to process than those sent to Servers 2 and 3. In effect, Server 1 bears a heavier load than Servers 2 and 3, in spite of an even distribution of task counts.
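The rotation described above can be sketched in a few lines. This is a minimal illustration, not a production balancer; the server and task names are made up for the example.

```python
from itertools import cycle

def round_robin(tasks, servers):
    """Assign each task to the next server in turn."""
    assignment = {}
    rotation = cycle(servers)  # endlessly repeats the server list
    for task in tasks:
        assignment[task] = next(rotation)
    return assignment

servers = ["server1", "server2", "server3"]
tasks = ["T1", "T2", "T3", "T4", "T5", "T6"]
print(round_robin(tasks, servers))
```

Note that the assignment only looks at task order, never at task cost, which is exactly the weakness described above.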
DNS Based Load Balancing
Here you configure the DNS to return a different IP address on successive lookups of your domain name. It is similar to the Round Robin scheme, except that clients cache the resolved IP address and keep using it until a new lookup is made. This is not really a recommended approach; it is more advisable to use load balancer software.
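The caching problem can be made concrete with a toy model. This is a sketch, not real DNS machinery: `DnsRoundRobin` stands in for a DNS server configured with multiple A records, and the addresses are invented.

```python
from itertools import cycle

class DnsRoundRobin:
    """Toy resolver: returns a different address on each fresh lookup,
    mimicking DNS configured with multiple A records."""
    def __init__(self, addresses):
        self._rotation = cycle(addresses)

    def lookup(self):
        return next(self._rotation)

class CachingClient:
    """Clients cache the first resolved address and keep reusing it,
    which is why DNS-based balancing distributes load poorly."""
    def __init__(self, resolver):
        self._resolver = resolver
        self._cached = None

    def resolve(self):
        if self._cached is None:
            self._cached = self._resolver.lookup()
        return self._cached

resolver = DnsRoundRobin(["10.0.0.1", "10.0.0.2"])
client = CachingClient(resolver)
print([client.resolve() for _ in range(3)])  # same address three times
```

Fresh lookups rotate through the addresses, but a single caching client pins itself to one server for the lifetime of its cache.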
Weighted Task Distribution Scheme
As the name indicates, tasks here are distributed to servers in a relative ratio. This works when the servers in a cluster do not all have the same capacity. Assume we have 3 servers, and Server 3 has half the capacity of the other two, giving weights of 2 : 2 : 1. Now assume there are 25 tasks to be distributed. Servers 1 and 2 would each get 10 tasks, while Server 3, with half the capacity, would get only 5. So the task distribution is done as per each server's capacity relative to the others. However, this still does not consider the processing effort required by each task.
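A simple way to realize the ratio above is to expand each server into as many rotation slots as its weight. This is one possible sketch of weighted round robin; the weights and server names are just the example's assumptions.

```python
def weighted_distribution(tasks, weights):
    """Assign tasks proportionally to server weights by expanding
    each server into 'weight' slots and rotating through the slots."""
    slots = [server for server, w in weights.items() for _ in range(w)]
    assignment = {server: [] for server in weights}
    for i, task in enumerate(tasks):
        assignment[slots[i % len(slots)]].append(task)
    return assignment

# Server 3 has half the capacity of the other two: weights 2 : 2 : 1.
weights = {"server1": 2, "server2": 2, "server3": 1}
tasks = [f"T{i}" for i in range(1, 26)]  # 25 tasks
result = weighted_distribution(tasks, weights)
print({s: len(ts) for s, ts in result.items()})
# {'server1': 10, 'server2': 10, 'server3': 5}
```

As in the prose, the split respects capacity but still treats every task as costing the same effort.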
Sticky Session Scheme
In the previous load balancing schemes, we have assumed that any incoming request is independent of the others. Consider a typical Java web application, where a request arrives at Server 1 and some values are written into session state. When the same user makes another request, which is sent to Server 2, that server cannot access the session data, as it is stored on Server 1.
A typical scenario: a user logs in, enters credentials, is validated, and is taken to the home page. On this first request, we store the user's credentials in a session on Server 1, as they will be needed across the application. The user's next request, say navigating from the home page to a registration page, is sent to Server 2. Here we would need the user credentials, which are stored in session state on Server 1. We end up with a request that cannot access its session data.
This can be resolved using sticky session load balancing, where instead of distributing individual tasks among servers, we distribute sessions: all requests from the same user session are sent to the same server, so no session data is lost. For example, in a shopping cart application, the entire session (user logs in -> home page -> selects items -> pays bill) is handled by a single server for that user. This in turn can result in an uneven distribution of workload, as some sessions involve more tasks than others.
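One common way to make sessions "stick" is to hash a session identifier to a server index, so every request carrying the same session cookie lands on the same server. This is a minimal sketch of that idea; real load balancers also handle server failures, which plain hashing does not.

```python
import hashlib

def pick_server(session_id, servers):
    """Hash the session id to a server index, so all requests in the
    same session are routed to the same server."""
    digest = hashlib.sha256(session_id.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

servers = ["server1", "server2", "server3"]

# Every request with this session cookie hits the same server.
first = pick_server("session-abc123", servers)
again = pick_server("session-abc123", servers)
assert first == again
```

The trade-off from the prose shows up here too: a user with a long, busy session keeps loading the same server no matter how busy it gets.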
Even Size Task Queue Distribution Scheme
This is similar to the weighted task distribution scheme, except that instead of routing tasks straight to a server all at once, they are kept in per-server queues. Each queue contains the tasks the server is currently processing or is about to process. Whenever a task finishes, it is removed from that server's queue.
Here we ensure that each server queue has the same number of tasks in progress. Servers with higher capacity finish their tasks faster, making room for the remaining tasks. This scheme takes into consideration both the capacity of each server and the effort needed to process each task: any new task is sent to the server whose queue has the fewest tasks lined up. If a server becomes overloaded, its queue grows larger than the task queues of the other servers, and the overloaded server is not assigned any new tasks until its queue clears.
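The queue-based routing above can be sketched as a small class: new tasks go to the shortest queue, and completions shrink a queue so that server becomes eligible again. Server names are invented; a real balancer would track completions via callbacks or health checks rather than an explicit `complete()` call.

```python
class QueueBalancer:
    """Send each new task to the server with the fewest queued tasks;
    an overloaded server's queue grows, so it stops receiving work."""
    def __init__(self, servers):
        self.queues = {server: [] for server in servers}

    def assign(self, task):
        # Pick the server whose queue currently has the fewest tasks.
        target = min(self.queues, key=lambda s: len(self.queues[s]))
        self.queues[target].append(task)
        return target

    def complete(self, server):
        # The server finished a task; remove it from that queue.
        if self.queues[server]:
            self.queues[server].pop(0)

lb = QueueBalancer(["server1", "server2"])
lb.assign("T1")
lb.assign("T2")
lb.complete("server1")   # server1 finishes quickly...
print(lb.assign("T3"))   # ...so the next task goes to server1
```

Because routing depends on queue length rather than a fixed rotation, a fast server naturally absorbs more tasks and a slow or overloaded one is left alone until it catches up.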