Hadoop Common / HADOOP-3713

broken symlinks in jobcache when local tasks are done but job is in progress

Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Duplicate
    • Affects Version/s: 0.17.0
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

    Description

      When all running tasks on a tasktracker are done, not all symlinks for /<mapred.local.dir>/taskTracker/jobcache/<job>/work are deleted. As a result, new tasks from the same job that are scheduled on this node fail with:

      2008-07-07 17:44:49,756 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction: task_200807071715_0022_r_000295_0
      2008-07-07 17:44:49,773 WARN org.apache.hadoop.mapred.TaskTracker: Error initializing task_200807071715_0022_r_000295_0:
      java.io.IOException: Mkdirs failed to create /tmp3/taskTracker/jobcache/job_200807071715_0022/work
      at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:680)
      at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:1274)
      at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:915)
      at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:1310)
      at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:2251)

      $ ls -lt /tmp3/taskTracker/jobcache/job_200807071715_0022/work
      lrwxrwxrwx 1 user users 135 Jul 7 17:44 /tmp3/taskTracker/jobcache/job_200807071715_0022/work -> /tmp0/taskTracker/jobcache/job_200807071715_0022/work
      $ ls -lt /tmp0/taskTracker/jobcache/job_200807071715_0022/work
      ls: /tmp0/taskTracker/jobcache/job_200807071715_0022/work: No such file or directory

      Earlier tasks scheduled on this tasktracker had completed successfully:

      2008-07-07 17:44:44,926 INFO org.apache.hadoop.mapred.TaskRunner: task_200807071715_0022_r_000004_0 done; removing files.
      2008-07-07 17:44:44,931 INFO org.apache.hadoop.mapred.TaskRunner: task_200807071715_0022_r_000176_0 done; removing files.
      2008-07-07 17:44:44,958 INFO org.apache.hadoop.mapred.TaskRunner: task_200807071715_0022_r_000210_0 done; removing files.
      2008-07-07 17:44:49,486 INFO org.apache.hadoop.mapred.TaskRunner: task_200807071715_0022_r_000153_0 done; removing files.
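
The `ls` output above shows the failure condition: `work` is a symlink whose target no longer exists. A minimal diagnostic sketch for spotting such links under a job directory follows. It is not the TaskTracker's code or the eventual fix; the class and method names (`DanglingLinkCheck`, `isDangling`, `findDangling`) are hypothetical, and it assumes `java.nio.file` (Java 7 or later), which postdates Hadoop 0.17.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
class DanglingLinkCheck {

    // A link is dangling when the path itself is a symlink but following
    // it leads nowhere. Files.exists() follows links by default, so it
    // returns false when the target is missing.
    static boolean isDangling(Path p) {
        return Files.isSymbolicLink(p) && !Files.exists(p);
    }

    // Collects the dangling symlinks that are direct children of dir.
    static List<Path> findDangling(Path dir) throws IOException {
        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                if (isDangling(entry)) {
                    result.add(entry);
                }
            }
        }
        return result;
    }

    // Usage: pass one or more job directories, for example
    // /tmp3/taskTracker/jobcache/job_200807071715_0022
    public static void main(String[] args) throws IOException {
        for (String arg : args) {
            for (Path link : findDangling(Paths.get(arg))) {
                System.out.println(link + " -> " + Files.readSymbolicLink(link)
                        + " (target missing)");
            }
        }
    }
}
```

Run against the job directory on each configured local dir, this would list entries such as the `work` link shown above, which can then be removed before new tasks of the same job are localized.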

            People

              Assignee: Unassigned
              Reporter: Rajiv Chittajallu (rajive)