MapReduce Details-
• The client-
Submits the MapReduce job.
• The jobtracker
Responsible for coordinating the job run.
• The tasktrackers
A job is a collection of tasks (the job is split into several tasks); tasktrackers are responsible for running those tasks.
• The distributed filesystem (HDFS)
It is the distributed filesystem used by Hadoop. Input and output files are placed on HDFS when a job runs in Hadoop's pseudo-distributed or fully distributed mode (intermediate map output is written to local disk, as described under Shuffle and Sort).
1) The JobClient's submitJob() method asks the jobtracker for a new job ID via the getNewJobId() method.
2) To run, the job needs some resources, for example the job JAR file, some configuration files,
and the input splits (the input is split, and each map task runs on one given split). All these resources are copied to the jobtracker's filesystem, into a directory named after the job ID.
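The two submission steps above can be sketched as follows. This is a minimal Python stand-in, not the real Hadoop API: JobTracker, JobClient, and the dict standing in for the shared filesystem are all illustrative names.

```python
import itertools

class JobTracker:
    """Illustrative stand-in for the jobtracker's job-ID service."""
    def __init__(self):
        self._ids = itertools.count(1)

    def get_new_job_id(self):
        # Step 1: hand out a fresh job ID.
        return f"job_{next(self._ids):04d}"

class JobClient:
    """Illustrative stand-in for the client-side submission logic."""
    def __init__(self, jobtracker, shared_fs):
        self.jt = jobtracker
        self.fs = shared_fs  # dict standing in for the shared filesystem

    def submit_job(self, jar, conf, splits):
        job_id = self.jt.get_new_job_id()
        # Step 2: copy the job's resources into a directory named after the ID.
        self.fs[job_id] = {"job.jar": jar, "job.xml": conf, "splits": splits}
        return job_id

fs = {}
client = JobClient(JobTracker(), fs)
job_id = client.submit_job(b"...jar bytes...", {"mapred.reduce.tasks": 2},
                           ["split0", "split1", "split2"])
print(job_id, sorted(fs[job_id]))  # → job_0001 ['job.jar', 'job.xml', 'splits']
```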
1) The jobtracker puts the job in an internal queue.
2) The job scheduler picks it up and initializes it.
3) The job scheduler retrieves the input splits. One map task is created for each input split, and each task is given an ID.
4) The number of reduce tasks is determined by the mapred.reduce.tasks property.
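Steps 3 and 4 of initialization can be sketched like this (the function name and task-ID format are illustrative; only the mapred.reduce.tasks property name comes from the notes above):

```python
def initialize_job(job_id, splits, conf):
    """Create one map task per input split; reduce count comes from config."""
    map_tasks = [f"{job_id}_m_{i:06d}" for i, _ in enumerate(splits)]
    num_reduces = int(conf.get("mapred.reduce.tasks", 1))
    reduce_tasks = [f"{job_id}_r_{i:06d}" for i in range(num_reduces)]
    return map_tasks, reduce_tasks

maps, reduces = initialize_job("job_0001",
                               ["split0", "split1", "split2"],
                               {"mapred.reduce.tasks": "2"})
print(len(maps), len(reduces))  # → 3 2 (one map per split; reduces from config)
```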
1) Tasktrackers periodically send a heartbeat to the jobtracker to say they are alive.
2) The heartbeat also conveys whether the tasktracker is ready to run a new task; if so, the jobtracker allocates it one.
3) A tasktracker has a fixed number of slots for map and reduce tasks. With the default scheduler, map tasks get priority over reduce tasks.
4) Data-local task: the task runs on the same node that its input split resides on.
5) Rack-local task: the task and its input split are not on the same node but are on the same rack.
It is possible that some tasks are neither data-local nor rack-local.
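The locality preference described in points 4 and 5 can be sketched as follows, assuming a hypothetical choose_task helper (the real scheduler logic lives inside the jobtracker): prefer a data-local task, then a rack-local one, then anything left.

```python
def choose_task(tracker_node, tracker_rack, pending):
    """pending: list of (task_id, split_node, split_rack) tuples."""
    for tid, node, rack in pending:
        if node == tracker_node:          # data-local: same node as the split
            return tid, "data-local"
    for tid, node, rack in pending:
        if rack == tracker_rack:          # rack-local: same rack, other node
            return tid, "rack-local"
    tid = pending[0][0]                   # neither data-local nor rack-local
    return tid, "off-rack"

pending = [("t1", "nodeB", "rack1"), ("t2", "nodeA", "rack1")]
print(choose_task("nodeA", "rack1", pending))  # → ('t2', 'data-local')
print(choose_task("nodeC", "rack1", pending))  # → ('t1', 'rack-local')
print(choose_task("nodeC", "rack9", pending))  # → ('t1', 'off-rack')
```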
1) Once a tasktracker has been assigned a task, it needs resources to run it. So it copies the job JAR from the shared filesystem to the tasktracker's local filesystem. It creates a local working directory for the task and extracts the contents of the JAR into that directory.
2) It then creates a TaskRunner instance to run the task. The TaskRunner launches a new Java Virtual Machine to run each task. The reason for a separate JVM is that bugs in the user's MapReduce application should not affect the tasktracker.
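The isolation argument in point 2 can be demonstrated with plain OS processes (Hadoop launches a child JVM; a Python subprocess is used here only as a stand-in): a crash in the "user code" child cannot take down the parent.

```python
import subprocess
import sys

def run_task(code):
    """Run the given 'task' code in a separate interpreter process."""
    result = subprocess.run([sys.executable, "-c", code],
                            stderr=subprocess.DEVNULL)  # hide child traceback
    return result.returncode == 0

ok = run_task("print('task ran')")
crashed = run_task("raise RuntimeError('bug in user code')")
print(ok, crashed)  # → True False (the parent survives the failed child)
```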
1) The JobClient polls the jobtracker for status. As soon as the jobtracker gets notification that the last task has completed, it announces that the job is complete.
1) If the user's MapReduce code throws a runtime exception, the task fails.
2) The tasktracker marks the task attempt as failed, freeing up a slot to run another task.
1) A tasktracker may crash or fail; in such cases it stops sending heartbeats to the jobtracker.
2) The jobtracker checks the mapred.tasktracker.expiry.interval property, which is in milliseconds.
If it does not receive a heartbeat within this interval, it concludes that the tasktracker has failed and removes it from its pool of tasktrackers to schedule tasks on.
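The expiry check above can be sketched as follows; only the mapred.tasktracker.expiry.interval property name comes from the notes, everything else (function name, timestamp representation) is illustrative:

```python
def expire_trackers(last_heartbeat_ms, now_ms, conf):
    """Keep only trackers whose last heartbeat is within the expiry interval."""
    interval = int(conf.get("mapred.tasktracker.expiry.interval", 600000))
    return {tracker: ts for tracker, ts in last_heartbeat_ms.items()
            if now_ms - ts <= interval}

pool = {"tt1": 1000, "tt2": 500000}   # tracker -> last heartbeat timestamp (ms)
alive = expire_trackers(pool, now_ms=700000,
                        conf={"mapred.tasktracker.expiry.interval": "600000"})
print(sorted(alive))  # → ['tt2'] (tt1's last heartbeat is older than 600 s)
```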
1) Failure of the jobtracker is the most serious failure mode. Currently, Hadoop has no mechanism for dealing with failure of the jobtracker.
Shuffle and Sort
1) The output of the map tasks is the input to the reduce tasks, so the process of transferring it is called the shuffle.
2) Map task output is not written to disk directly; each map task has a circular memory buffer of 100 MB by default. When the buffer reaches 80% of its capacity, the data is spilled to disk. Spill files are created under the directories given by the mapred.local.dir property, in a job-specific subdirectory.
3) Background threads divide the data into partitions, perform a sort, and send the output to the combiner class (if one is set). The output files are made available to reducers as input over HTTP. The reducer has copier threads that fetch the map data; the number of copier threads is configurable via the mapred.reduce.parallel.copies property.
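The map-side spill path in point 3 (partition, sort within each partition, then combine) can be sketched like this. The partition function is a deterministic stand-in for Hadoop's default hash partitioner, and the summing combiner is just an example; both are illustrative, not the real implementation.

```python
from collections import defaultdict

def partition(key, num_reduces):
    # Deterministic stand-in for the default hash partitioner.
    return ord(key[0]) % num_reduces

def spill(records, num_reduces):
    """records: (key, value) pairs from the in-memory buffer."""
    partitions = defaultdict(list)
    for key, value in records:
        partitions[partition(key, num_reduces)].append((key, value))
    spilled = {}
    for p, pairs in partitions.items():
        pairs.sort(key=lambda kv: kv[0])      # sort within the partition
        combined = defaultdict(int)
        for key, value in pairs:              # example combiner: sum per key
            combined[key] += value
        spilled[p] = sorted(combined.items())
    return spilled

out = spill([("b", 1), ("a", 1), ("b", 2)], num_reduces=2)
print(out)  # → {0: [('b', 3)], 1: [('a', 1)]}
```

One file per reduce partition is what a real spill produces; here each dict entry stands in for one such partition file.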
4) How do reducers know which tasktrackers to fetch map output from?
As map tasks complete successfully, they notify their tasktracker, which in turn notifies the jobtracker through the heartbeat mechanism.
A thread in the reducer periodically asks the jobtracker for map output locations.
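This notification flow can be sketched as follows, assuming a hypothetical event log inside the jobtracker (the class and method names are illustrative): heartbeats append map-completion events, and reducers poll from the index of the last event they have seen.

```python
class JobTrackerEvents:
    """Illustrative event log for completed map tasks."""
    def __init__(self):
        self.completions = []   # list of (map_task_id, tasktracker_host)

    def heartbeat(self, host, completed):
        # A tasktracker reports its newly completed map tasks.
        self.completions.extend((tid, host) for tid in completed)

    def map_locations(self, from_index):
        # A reducer thread polls for events it has not yet seen.
        return self.completions[from_index:]

jt = JobTrackerEvents()
jt.heartbeat("tt1", ["m_000000"])
jt.heartbeat("tt2", ["m_000001"])
seen = jt.map_locations(0)
print(seen)  # → [('m_000000', 'tt1'), ('m_000001', 'tt2')]
```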