MAPREDUCE-3926 (Hadoop Map/Reduce)

Job History has no information about an unfinished map task if all attempts of another map task fail.

Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 0.20.205.0
    • Fix Version/s: None
    • Component/s: jobtracker
    • Labels: None

    Description

      The Job History contains no information about an unfinished map task if all attempts of another map task fail.

      For example:
      1. The first attempt of the first map task, m_000000_0, was still making progress.

      2. The second map task failed 4 times before the first map task's attempt completed.

      3. As a result, a job cleanup task was launched and completed before the first map task's attempt finished.

      4. After the job cleanup task completed, runningMapCache was cleared (set to null):

      completedTask() -> jobComplete() -> garbageCollect() ->  this.runningMapCache = null;
                 |-----> retireMap() -> if (runningMapCache == null) "Running cache for maps missing!! Job details are missing."
      

      5. As a result, retireMap(), which is called after jobComplete(), reports the error
      "Running cache for maps missing!! Job details are missing." and no further
      information is added to Job History. The first map task's information is
      therefore missing from the Job History page.

      I have created a sample streaming MR job to reproduce this issue.

      mapper.sh
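      #!/bin/bash
      # Read the first input line; it selects the behaviour of this map task.
      read -r line
      if [[ "$line" == "sleep" ]]
      then
          # Long-running map task: stay alive for about 15 seconds, then succeed.
          for i in 1 2 3
          do
              echo "Sleeping" >&2
              sleep 5
          done
          exit 0
      else
          # Failing map task: exit immediately with a non-zero status.
          echo "Exiting" >&2
          exit -1
      fi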
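
      Before submitting the job, the two mapper behaviours can be checked locally. The following is a minimal sketch, not part of the original report: it wraps the same logic in a function named mapper and adds a hypothetical SLEEP_SECS variable so the check finishes quickly. It assumes only that a non-zero status is what makes the attempt fail, so the stand-in returns 1 for the failing branch.

```shell
#!/bin/bash
# Local stand-in for mapper.sh. SLEEP_SECS is a hypothetical knob (not in the
# original script) so the check runs quickly; the original sleeps 5 seconds.
SLEEP_SECS="${SLEEP_SECS:-0}"

mapper() {
    local line i
    read -r line
    if [[ "$line" == "sleep" ]]; then
        # Long-running branch: three sleep rounds, then success.
        for i in 1 2 3; do
            echo "Sleeping" >&2
            sleep "$SLEEP_SECS"
        done
        return 0
    else
        # Failing branch: any non-zero status marks the attempt as failed.
        echo "Exiting" >&2
        return 1
    fi
}

# Feed the content of each input file to the mapper and record its status.
sleep_status=0
echo "sleep" | mapper 2>/dev/null || sleep_status=$?
exit_status=0
echo "exit" | mapper 2>/dev/null || exit_status=$?

echo "sleep input -> $sleep_status, exit input -> $exit_status"
```

      A status of 0 for the "sleep" input and a non-zero status for the "exit" input match the long-running and failing map tasks described in this report.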
      

      Input file in1.txt is for the long-running map task (here, the first map task).

      /user/mitesh/input/in1.txt
      sleep
      

      Input file in2.txt is for the failing map task (here, the second map task).

      /user/mitesh/input/in2.txt
      exit
      

      Running the sample streaming MR job.

      $ hadoop fs -rmr -skipTrash xyz
      $ hadoop jar $HADOOP_INSTALL/hadoop-streaming.jar -Dmapred.map.max.attempts=2 -Dmapred.min.split.size=7 -Dmapred.map.tasks=2 -mapper "mapper.sh" -file mapper.sh -reducer NONE -input /user/mitesh/input/in1.txt -input /user/mitesh/input/in2.txt -output xyz
      

      Job History web UI

      Hadoop Job job_201201310454_542302 on History Viewer
      User: mitesh
      JobName: streamjob7439640883203077520.jar
      JobConf: hdfs://nn:port/user/mitesh/.staging/job_201201310454_542302/job.xml
      Job-ACLs:
          mapreduce.job.acl-view-job: No users are allowed
          mapreduce.job.acl-modify-job: No users are allowed
      Submitted At: 27-Feb-2012 12:56:02
      Launched At: 27-Feb-2012 12:56:11 (8sec)
      Finished At: 27-Feb-2012 12:56:31 (20sec)
      Status: FAILED
      Failure Info: # of failed Map Tasks exceeded allowed limit. FailedCount: 1. LastFailedTask: task_201201310454_542302_m_000001
      Analyse This Job
      Kind	Total Tasks(successful+failed+killed)	Successful tasks	Failed tasks	Killed tasks	Start Time	Finish Time
      Setup 	1 	1 	0 	0 	27-Feb-2012 12:56:12 	27-Feb-2012 12:56:16 (4sec)
      Map 	2 	0 	2 	0 	27-Feb-2012 12:56:16 	27-Feb-2012 12:56:26 (10sec)
      Reduce 	0 	0 	0 	0 		
      Cleanup 	1 	1 	0 	0 	27-Feb-2012 12:56:26 	27-Feb-2012 12:56:31 (4sec)
      

      The summary above shows only 2 failed tasks, which belong to the second map task.
      The task tracker that ran the first map task can be found only in the JobTracker (JT) logs.
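
      One way to confirm that a job hit this problem is to search the JT log for the error message quoted in step 5. This is a minimal sketch, assuming the JT log is a plain text file; the function name count_missing_cache_errors is hypothetical, and the log file location depends on the installation.

```shell
#!/bin/bash
# Hypothetical helper: print the number of lines in a log file that contain
# the retireMap() error message quoted in this report.
# Single quotes keep "!!" literal even in an interactive shell.
count_missing_cache_errors() {
    local log_file="$1"
    grep -c -F 'Running cache for maps missing!! Job details are missing.' "$log_file"
}
```

      It would be called with the path to the JT log as its only argument. A count greater than zero indicates that retireMap() ran after runningMapCache had already been cleared.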

          People

            Assignee: Unassigned
            Reporter: Mitesh Singh Jat (miteshsjat)