I had always wondered about the difference between clustering and horizontal scaling, and I was under the impression that they were the same thing. Recently I was reading this link about MapReduce, which forced me to think again about the difference. I also discussed it with one of my peers to get some clarity. This is my understanding of the two.
Horizontal scaling is about scalability. If your application is getting more traffic than usual, you need to scale the system to handle more requests, just as a restaurant adds more waiters when more customers visit. Each waiter works independently of the others. In our case we add more hosts to the application, and each host handles its requests completely independently. But who tells a request which host to go to? This is where the load balancer comes into the picture. The load balancer routes each request to a host, for example the one with the least load at that moment. Because the load balancer is the external-facing server, it has to be highly available, so we need to keep a backup node for it. When the main load balancer dies, the backup takes control.
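To make the routing idea concrete, here is a minimal sketch in Python. It is only an illustration: the names `Host`, `LoadBalancer`, and `send`, the in-memory request counter, and the `alive` flag are all assumptions made for this example, and a real load balancer works at the network level with health checks rather than a try/except.

```python
class Host:
    """One application server that handles requests independently."""

    def __init__(self, name):
        self.name = name
        self.active_requests = 0  # hypothetical load metric for this sketch

    def handle(self, request):
        # Requests never finish in this sketch, so the counter only grows.
        self.active_requests += 1
        return f"{self.name} handled {request}"


class LoadBalancer:
    """Routes each request to the host with the fewest active requests."""

    def __init__(self, hosts):
        self.hosts = hosts
        self.alive = True  # flipped to False to simulate a crash

    def route(self, request):
        if not self.alive:
            raise ConnectionError("load balancer is down")
        least_loaded = min(self.hosts, key=lambda h: h.active_requests)
        return least_loaded.handle(request)


def send(request, primary, backup):
    """Try the main load balancer first; fall back to the backup if it is down."""
    try:
        return primary.route(request)
    except ConnectionError:
        return backup.route(request)


hosts = [Host("host-1"), Host("host-2")]
primary = LoadBalancer(hosts)
backup = LoadBalancer(hosts)

first = send("request-1", primary, backup)
second = send("request-2", primary, backup)
primary.alive = False  # the main load balancer dies
third = send("request-3", primary, backup)
```

Adding capacity here just means appending another `Host` to the list. The hosts never talk to each other, which is the main point of horizontal scaling.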
Clustering is used when we have a computationally large problem and need more resources to solve it in parallel. We break the problem into subtasks, assign different subtasks to different nodes, and combine the partial results into the final result. This is distributed computing. MapReduce helps us create these distributed tasks.
The map step breaks the work down and hands the pieces to the mapping nodes. An aggregator then collects the output of the mappers and passes it to the reducer, which reduces the map results to the final result.
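The flow above can be sketched with the usual word-count example. This is a single-process simulation in Python, not a real MapReduce framework: the function names `map_phase`, `aggregate`, `reduce_phase`, and `word_count` are made up for this sketch, and on a real cluster each chunk would be mapped on a different node.

```python
from collections import defaultdict


def map_phase(chunk):
    """Mapper: turn one chunk of text into (word, 1) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]


def aggregate(mapped_outputs):
    """Aggregator: group the values from all mappers by key."""
    grouped = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reducer: collapse each key's list of values into one count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(chunks):
    # Each map_phase call is independent, so a cluster could run them in parallel.
    mapped = [map_phase(chunk) for chunk in chunks]
    return reduce_phase(aggregate(mapped))


counts = word_count(["the cat", "the dog the"])
```

Because each `map_phase` call only looks at its own chunk, the chunks can be processed on separate nodes, and only the aggregation step needs to see all of the intermediate output.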
You can read more about the MapReduce function in this article.
Want to understand MapReduce through an example? This article is a good start.