Hadoop Distributed File System (HDFS) High Availability

By Dirk deRoos

Often in Hadoop’s infancy, a great amount of discussion was centered on the NameNode’s representation of a single point of failure. Hadoop, overall, has always had a robust and failure-tolerant architecture, with the exception of this key area. Without the NameNode, there’s no Hadoop cluster.

Using Hadoop 2, you can configure HDFS so that there’s an Active NameNode and a Standby NameNode. The Standby NameNode needs to be on a dedicated master node that’s configured identically to the master node used by the Active NameNode.

image0.jpg

The Standby NameNode isn’t sitting idly by while the NameNode handles all the block address requests. The Standby NameNode, charged with the task of keeping the state of the block locations and block metadata in memory, handles the HDFS checkpointing responsibilities.

The Active NameNode writes journal entries on file changes to the majority of the JournalNode services, which run on the master nodes. (Note: The HDFS high availability solution requires at least three master nodes, and if there are more, there can be only an odd number.)

If a failure occurs, the Standby Node first reads all completed journal entries (where a majority of Journal Nodes have an entry, in other words), to ensure that the new Active NameNode is fully consistent with the state of the cluster.

Zookeeper is used to monitor the Active NameNode and to handle the failover logistics if the Active NameNode becomes unavailable. Both the Active and Standby NameNodes have dedicated Zookeeper Failover Controllers (ZFC) that perform the monitoring and failover tasks. In the event of a failure, the ZFC informs the Zookeeper instances on the cluster, which then elect a new Active NameNode.

Apache Zookeeper provides coordination and configuration services for distributed systems, so it’s no wonder we see it used all over the place in Hadoop.