YARN Container Launch Details

Resource Model

YARN supports a very general resource model for applications. An application (via the ApplicationMaster) can request resources with highly specific requirements such as:

  • Resource-name (hostname, rackname)
  • Memory (in MB)
  • CPU (cores)

ResourceRequest

An application, via its ApplicationMaster, issues resource requests to satisfy its resource needs. The Scheduler responds to a resource request by granting a container.

ResourceRequest has the following form:

<resource-name, priority, resource-requirement, number-of-containers>
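The four fields of the tuple can be pictured as a small record. The following is a minimal sketch in Python, not the Hadoop Java API: the class and field names (`Resource`, `ResourceRequest`, `memory_mb`, `vcores`) and the helper `total_memory_mb` are chosen here for illustration, and using `"*"` to mean "any host or rack" is an assumption that the text above does not state.

```python
from dataclasses import dataclass

# Assumption for this sketch: "*" stands for "any host or rack".
ANY = "*"


@dataclass(frozen=True)
class Resource:
    """The resource-requirement part of a request."""
    memory_mb: int
    vcores: int


@dataclass(frozen=True)
class ResourceRequest:
    """<resource-name, priority, resource-requirement, number-of-containers>"""
    resource_name: str      # hostname, rack name, or ANY
    priority: int
    capability: Resource    # memory and CPU needed per container
    num_containers: int


def total_memory_mb(request: ResourceRequest) -> int:
    """Memory the request would consume if every container were granted."""
    return request.capability.memory_mb * request.num_containers


# Hypothetical request: two 1 GB, single-core containers anywhere in the cluster.
request = ResourceRequest(ANY, priority=1,
                          capability=Resource(memory_mb=1024, vcores=1),
                          num_containers=2)
```

Each container is sized by the same resource-requirement, so one request describes a batch of identical containers rather than a single allocation.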

Container

A Container is the resource allocation that results when the ResourceManager grants a specific ResourceRequest. It gives an application the right to use a specific amount of resources (memory, CPU, etc.) on a specific host.

To use those resources for launching its tasks, the ApplicationMaster must present the Container to the NodeManager that manages the host on which the container was allocated.
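The idea that a container is a grant of a fixed amount of resources on one specific host can be illustrated with a toy allocator. This is a deliberately simplified sketch, not the YARN scheduler: the function `grant_containers`, the `Container` record, and the first-fit policy are all assumptions made for illustration, and `"*"` is again assumed to mean "any host".

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Container:
    """A grant: the right to use some memory and CPU on one host."""
    container_id: int
    host: str
    memory_mb: int
    vcores: int


def grant_containers(free: Dict[str, Tuple[int, int]], resource_name: str,
                     memory_mb: int, vcores: int, count: int) -> List[Container]:
    """First-fit toy allocator.

    `free` maps host -> (free memory in MB, free cores) and is updated in
    place as containers are granted. Returns at most `count` containers.
    """
    hosts = list(free) if resource_name == "*" else [resource_name]
    granted: List[Container] = []
    for host in hosts:
        while len(granted) < count and host in free:
            mem, cores = free[host]
            if mem < memory_mb or cores < vcores:
                break  # this host cannot fit another container
            free[host] = (mem - memory_mb, cores - vcores)
            granted.append(Container(len(granted) + 1, host, memory_mb, vcores))
    return granted


# Hypothetical cluster with two nodes.
cluster = {"node1": (2048, 2), "node2": (1024, 4)}
containers = grant_containers(cluster, "*", memory_mb=1024, vcores=1, count=3)
```

Every returned `Container` names exactly one host, which is why the ApplicationMaster has to take it to the NodeManager of that host rather than to any node in the cluster.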

Container Specification during Container Launch

The ApplicationMaster has to provide considerably more information to the NodeManager to actually launch the container.

The YARN Container launch specification API covers:

  • Command line to launch the process within the container.
  • Environment variables.
  • Local resources necessary on the machine prior to launch, such as jars, shared-objects, auxiliary data files etc.
  • Security-related tokens.
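The four parts of the launch specification listed above can be grouped into one record. This is a sketch under assumptions: `LaunchSpec` and its field names are hypothetical and only mirror the list, and the paths in the example are made up. It is not the Hadoop class that carries this information.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LaunchSpec:
    """Everything the NodeManager needs to start one container (toy model)."""
    commands: List[str]                                             # command line to run
    environment: Dict[str, str] = field(default_factory=dict)       # environment variables
    local_resources: Dict[str, str] = field(default_factory=dict)   # file name -> source location
    tokens: bytes = b""                                             # opaque security tokens

    def validate(self) -> None:
        """Reject a specification that has nothing to execute."""
        if not self.commands:
            raise ValueError("a launch specification needs at least one command")


# Hypothetical specification for a single task.
spec = LaunchSpec(
    commands=["java", "MyTask"],
    environment={"JAVA_HOME": "/usr/lib/jvm/java"},
    local_resources={"job.jar": "hdfs:///tmp/staging/job.jar"},
)
spec.validate()
```

Keeping the resources as a name-to-source mapping reflects the point made in the list: the files must be present on the machine before the command line runs.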

Container Launch

On receiving a container-launch request, the NodeManager performs the following set of steps to launch the container.

1. A local copy of all the specified resources is created.

2. Isolated work directories are created for the container, and the local resources are made available in these directories.

3. The launch environment and command line are used to start the actual container.
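The three steps above can be mimicked on a single machine with the Python standard library. This is only a sketch of the sequence, not NodeManager code: the function `launch_container`, the directory names `filecache` and `container_work`, and the use of symlinks to expose the resources are assumptions made for illustration.

```python
import os
import shutil
import subprocess
import sys
import tempfile
from typing import Dict, List


def launch_container(local_resources: Dict[str, str], environment: Dict[str, str],
                     command: List[str], base_dir: str) -> subprocess.CompletedProcess:
    """Run `command` in a fresh work directory populated with `local_resources`.

    `local_resources` maps the file name the process expects to the path of
    the source file. `base_dir` is a scratch directory owned by the caller.
    """
    cache_dir = os.path.join(base_dir, "filecache")
    work_dir = os.path.join(base_dir, "container_work")
    os.makedirs(cache_dir)
    os.makedirs(work_dir)

    for name, source in local_resources.items():
        # Step 1: create a local copy of each specified resource.
        cached = os.path.join(cache_dir, name)
        shutil.copy(source, cached)
        # Step 2: make the copy visible inside the isolated work directory.
        os.symlink(cached, os.path.join(work_dir, name))

    # Step 3: start the process with the requested environment and command line.
    env = dict(os.environ)
    env.update(environment)
    return subprocess.run(command, cwd=work_dir, env=env,
                          capture_output=True, text=True)
```

A caller would pass a scratch directory (for example one under `tempfile.mkdtemp()`) and a command such as `[sys.executable, "-c", "..."]`, then inspect `returncode` and `stdout` on the result.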

Files used for launching Containers:

  1. default_container_executor.sh
  2. launch_container.sh
  3. job.xml
  4. job.jar
  5. container_tokens

Details:

1) default_container_executor.sh

  • Records the PID of the container process
  • Executes launch_container.sh script

2) launch_container.sh

  • Sets environment variables, e.g. YARN_LOCAL_DIRS, SHELL, HADOOP_COMMON_HOME, JAVA_HOME.

  • Sets the classpath, which includes the paths of all jar files required to run the map/reduce task.

  • Launches the YarnChild JVM in which the map/reduce task runs.
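The shape of such a script (exports first, then the classpath, then the JVM launch) can be sketched with a small generator. This is an illustration only: `render_launch_script` is a hypothetical helper, the paths are made up, and the real script generated by the NodeManager contains more than this (for example, the arguments passed to YarnChild are omitted here).

```python
import shlex
from typing import Dict, List


def render_launch_script(environment: Dict[str, str], classpath: List[str],
                         command: List[str]) -> str:
    """Return the text of a minimal launch script.

    Values are shell-quoted when they contain unsafe characters, so no
    variable expansion happens inside them in this sketch.
    """
    lines = ["#!/bin/bash", ""]
    # 1. Environment variables.
    for name, value in environment.items():
        lines.append(f"export {name}={shlex.quote(value)}")
    # 2. Classpath: all jar paths joined with ':'.
    lines.append(f"export CLASSPATH={shlex.quote(':'.join(classpath))}")
    # 3. Replace the shell with the task JVM.
    lines.append("exec " + " ".join(shlex.quote(arg) for arg in command))
    return "\n".join(lines) + "\n"


# Hypothetical values for illustration.
script = render_launch_script(
    environment={"JAVA_HOME": "/usr/lib/jvm/java", "SHELL": "/bin/bash"},
    classpath=["job.jar", "/opt/hadoop/lib/a.jar"],
    command=["java", "org.apache.hadoop.mapred.YarnChild"],
)
print(script)
```

Using `exec` for the last line means the JVM takes over the shell process instead of running as its child, so the recorded PID refers to the task itself.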

3) job.xml

  • Contains all the YARN, DFS, MapReduce and Hadoop configuration properties needed to run the map/reduce task

4) job.jar

  • Input jar of the MapReduce application

5) container_tokens

  • Security token file