YARN’s Application Master in Hadoop - dummies

By Dirk deRoos

Unlike the other YARN (Yet Another Resource Negotiator) components, the Application Master has no direct counterpart in Hadoop 1. In essence, it takes over work that the JobTracker did for every application, but the implementation is radically different.

Each application running on the Hadoop cluster has its own dedicated Application Master instance, which runs in a container process on a slave node. The JobTracker, by contrast, was a single daemon that ran on a master node and tracked the progress of all applications.

Throughout its life (that is, while the application is running), the Application Master sends heartbeat messages to the Resource Manager with its status and the state of the application’s resource needs. Based on the results of its scheduling, the Resource Manager assigns container resource leases (basically reservations for the resources that containers need) on specific slave nodes to the Application Master.
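
The heartbeat exchange described above can be sketched as a small simulation. This is a toy model in plain Python, not the YARN API: the names ToyResourceManager, Lease, heartbeat, and run_heartbeats are all hypothetical, and the first-fit placement by memory is an assumption made only to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lease:
    """A reservation for one container on a specific node (toy model)."""
    container_id: int
    node: str
    memory_mb: int


@dataclass
class ToyResourceManager:
    """Hypothetical stand-in for the Resource Manager's scheduler."""
    free_mb: dict  # node name -> free memory in MB
    next_id: int = 1

    def heartbeat(self, progress, wanted, memory_mb):
        """Receive the AM's status and outstanding ask; return new leases."""
        leases = []
        # Assumed policy: first fit, visiting nodes in name order.
        for node in sorted(self.free_mb):
            while len(leases) < wanted and self.free_mb[node] >= memory_mb:
                self.free_mb[node] -= memory_mb
                leases.append(Lease(self.next_id, node, memory_mb))
                self.next_id += 1
        return leases


def run_heartbeats(rm, containers_needed, memory_mb, max_beats=10):
    """Heartbeat until every requested container has a lease."""
    held = []
    for _ in range(max_beats):
        outstanding = containers_needed - len(held)
        if outstanding == 0:
            break
        # Each heartbeat carries status (progress) and remaining needs.
        progress = len(held) / containers_needed
        held.extend(rm.heartbeat(progress, outstanding, memory_mb))
    return held


rm = ToyResourceManager(free_mb={"node-a": 2048, "node-b": 1024})
leases = run_heartbeats(rm, containers_needed=3, memory_mb=1024)
for lease in leases:
    print(lease)
```

In this sketch the Application Master never picks nodes itself. It only reports what it still needs, and the leases it gets back already name the node on which each container may run.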

The Application Master oversees the full lifecycle of an application, all the way from requesting the needed containers from the Resource Manager to submitting container lease requests to the NodeManager.
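
As a rough illustration of that lifecycle, the sketch below walks one application from registration to shutdown. It is a simplified model under stated assumptions: ToyNodeManager, start_container, and run_application are hypothetical names, leases are plain dictionaries, and the leases are passed in as if the Resource Manager had already granted them.

```python
class ToyNodeManager:
    """Hypothetical NodeManager: starts a container when shown a valid lease."""

    def __init__(self, name):
        self.name = name
        self.running = []

    def start_container(self, lease, command):
        # A lease is only valid on the node it was issued for.
        if lease["node"] != self.name:
            raise ValueError("lease was issued for a different node")
        self.running.append((lease["container_id"], command))


def run_application(commands, granted_leases, node_managers):
    """Walk one application through the lifecycle; return the event log."""
    log = ["registered with Resource Manager"]
    log.append(f"requested {len(commands)} containers")
    # Pair each unit of work with one lease and present it to that node.
    for command, lease in zip(commands, granted_leases):
        node_managers[lease["node"]].start_container(lease, command)
        log.append(f"launched {command!r} in container "
                   f"{lease['container_id']} on {lease['node']}")
    log.append("unregistered from Resource Manager")
    return log


nms = {"node-a": ToyNodeManager("node-a"), "node-b": ToyNodeManager("node-b")}
leases = [{"container_id": 1, "node": "node-a"},
          {"container_id": 2, "node": "node-b"}]
events = run_application(["map-0", "map-1"], leases, nms)
print("\n".join(events))
```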

Each application framework that’s written for Hadoop must have its own Application Master implementation. MapReduce, for example, has a specific Application Master that’s designed to execute map tasks and reduce tasks in sequence.
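
To make the idea of a framework-specific Application Master concrete, here is a minimal sketch in which a base class defines the contract and a MapReduce-style subclass runs its map phase before its reduce phase. The class and function names (ToyApplicationMaster, ToyMapReduceMaster, count_words, total) are invented for illustration, and everything runs in one process rather than in containers.

```python
from abc import ABC, abstractmethod
from collections import defaultdict


class ToyApplicationMaster(ABC):
    """Hypothetical base class: each framework supplies its own run()."""

    @abstractmethod
    def run(self, inputs):
        ...


class ToyMapReduceMaster(ToyApplicationMaster):
    """Runs every map task first, then every reduce task."""

    def __init__(self, map_fn, reduce_fn):
        self.map_fn = map_fn
        self.reduce_fn = reduce_fn

    def run(self, inputs):
        # Map phase: one task per input split, each emitting (key, value) pairs.
        grouped = defaultdict(list)
        for split in inputs:
            for key, value in self.map_fn(split):
                grouped[key].append(value)
        # Reduce phase: starts only after all map output has been grouped.
        return {key: self.reduce_fn(key, values)
                for key, values in sorted(grouped.items())}


def count_words(split):
    """Map function: emit (word, 1) for every word in the split."""
    return [(word, 1) for word in split.split()]


def total(key, values):
    """Reduce function: sum the counts collected for one word."""
    return sum(values)


master = ToyMapReduceMaster(count_words, total)
result = master.run(["yarn hadoop", "yarn"])
print(result)
```

A different framework would subclass the same base with its own scheduling logic, which is the point of keeping the Application Master separate from the cluster-wide Resource Manager.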