Networking and Hadoop Clusters

By Dirk deRoos

As with any distributed system, networking can make or break a Hadoop cluster: don’t “go cheap.” The master nodes and slave nodes in a Hadoop cluster exchange a great deal of chatter that is essential to keeping the cluster running, so enterprise-class switches are definitely recommended.

For each rack in your cluster, you need two top-of-rack (ToR) switches, for both redundancy and performance. Use 10GbE for ToR switches.

ToR switches are network switches that connect all the computers in a rack together. You normally see them at the very top of a rack, which is why people say “top-of-rack.” An alternative networking approach is to use end-of-row (EoR) switches, but you don’t see this very often.

The ToR approach is simpler from a networking perspective for growing clusters. For example, adding slave nodes and additional racks is far easier with ToR switches than with EoR switches.

When you have more than three racks, you need at least two core switches (again, primarily for redundancy, but also for performance). These core switches handle massive amounts of traffic, so 40GbE is a necessity.
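
The switch counts described so far reduce to simple arithmetic, and a short sketch can make the rule of thumb concrete. The Python below is only an illustration: the names SwitchPlan and plan_switches are hypothetical, and reporting zero core switches for three or fewer racks is an assumption, since the guidance here states a core-switch minimum only for clusters with more than three racks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SwitchPlan:
    """Minimum switch counts suggested by the rule of thumb (hypothetical type)."""
    tor_switches: int       # top-of-rack switches, 10GbE
    min_core_switches: int  # core switches, 40GbE (0 = no minimum stated)


def plan_switches(racks: int) -> SwitchPlan:
    """Return minimum switch counts for a cluster with the given number of racks."""
    if racks < 1:
        raise ValueError("a cluster needs at least one rack")

    # Two ToR switches per rack, for redundancy and performance.
    tor_switches = 2 * racks

    # At least two core switches once the cluster exceeds three racks.
    # Assumption: for three racks or fewer, no minimum is given, so report 0.
    min_core_switches = 2 if racks > 3 else 0

    return SwitchPlan(tor_switches, min_core_switches)


if __name__ == "__main__":
    for racks in (1, 3, 4, 8):
        print(racks, plan_switches(racks))
```

These figures are minimums only. Actual core capacity depends on workload and growth plans, which this sketch does not model.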

If you’re building or expanding a cluster to span multiple racks, engage networking experts who are familiar with Hadoop, your future growth plans, and your workload. Bad networking can not only severely hamper performance but also make future growth painful and expensive.