Hadoop Common
HADOOP-142

failed tasks should be rescheduled on different hosts after other jobs

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.1.1
    • Fix Version/s: 0.2.0
    • Component/s: None
    • Labels:
      None

      Description

      Currently when tasks fail, they are usually rerun immediately on the same host. This causes problems in a couple of ways:
      1. The task is more likely to fail on the same host.
      2. If there is cleanup code (such as clearing pendingCreates) it does not always run immediately, leading to cascading failures.

      For a first pass, I propose that when a task fails, we start the scan for new tasks to launch at the following task of the same type (within that job). So if maps[99] fails, when we are looking to assign new map tasks from this job, we scan maps[100]...maps[N], maps[0]...maps[99].
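
      A minimal sketch of that scan order (hypothetical code, not from the patch; TaskInProgress and its isRunnable() helper are assumed here for illustration):

      // Hypothetical sketch: pick the next runnable task starting just after the
      // index that failed most recently, wrapping around so the failed task is
      // considered last.
      static int nextTaskToRun(TaskInProgress[] maps, int lastFailedIndex) {
          int n = maps.length;
          for (int offset = 1; offset <= n; offset++) {
              int i = (lastFailedIndex + offset) % n;  // maps[100]...maps[N], then maps[0]...maps[99]
              if (maps[i].isRunnable()) {              // isRunnable() is an assumed helper
                  return i;
              }
          }
          return -1;                                   // nothing left to launch in this job
      }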

      A more involved change would avoid running tasks on nodes where they have failed before. This is a little tricky, because you don't want to prevent re-execution of tasks on 1-node clusters, and the job tracker needs to schedule one task tracker at a time.
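
      A rough sketch of one way to track failed hosts per task while still allowing re-execution on a 1-node cluster (hypothetical; this is not the actual TaskInProgress code):

      import java.util.HashSet;
      import java.util.Set;

      // Hypothetical sketch: remember the hosts a task has failed on, and only
      // skip such a host when the cluster has somewhere else to run the task.
      class FailedHostTracker {
          private final Set<String> failedHosts = new HashSet<String>();

          void noteFailure(String host) {
              failedHosts.add(host);
          }

          boolean mayRunOn(String host, int numTrackers) {
              if (!failedHosts.contains(host)) {
                  return true;
              }
              // On a 1-node cluster there is no alternative, so allow the rerun.
              return numTrackers <= 1;
          }
      }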

        Activity

        Transition           Time In Source Status   Execution Times   Last Executer   Last Execution Date
        Open -> Resolved     23h 42m                 1                 Doug Cutting    19/Apr/06 06:07
        Resolved -> Closed   17d 28m                 1                 Doug Cutting    06/May/06 06:35
        Owen O'Malley made changes -
            Component/s: mapred [ 12310690 ]
        Doug Cutting made changes -
            Workflow: no-reopen-closed [ 12373413 ] -> no-reopen-closed, patch-avail [ 12377724 ]
        Doug Cutting made changes -
            Workflow: no reopen closed [ 12373077 ] -> no-reopen-closed [ 12373413 ]
        Doug Cutting made changes -
            Workflow: jira [ 12361184 ] -> no reopen closed [ 12373077 ]
        Doug Cutting made changes -
            Status: Resolved [ 5 ] -> Closed [ 6 ]
        Doug Cutting made changes -
            Status: Open [ 1 ] -> Resolved [ 5 ]
            Resolution: Fixed [ 1 ]
        Doug Cutting added a comment -

        I just committed this. Thanks, Owen!

        Owen O'Malley made changes -
            Attachment: no-repeat-failures.patch [ 12325520 ]
        Owen O'Malley added a comment -

        This patch does three things:
        1. When a task fails, it sets the following task to be the first to be checked for assignment to a TaskTracker.
        2. Tasks prefer not to run on TaskTrackers where they have failed before.
        3. Speculative tasks will not run on TaskTrackers where they have failed.

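        The sketch below is a hypothetical illustration of the policy in points 2 and 3 above: a tracker where the task has already failed is only a soft preference for ordinary attempts, but a hard exclusion for speculative attempts. It is not the actual patch code.

        // Hypothetical sketch (Set is java.util.Set).
        boolean acceptTracker(String tracker, Set<String> failedOn,
                              boolean speculative, boolean otherTrackersFree) {
            if (!failedOn.contains(tracker)) {
                return true;                    // never failed here: always acceptable
            }
            if (speculative) {
                return false;                   // speculative attempts never rerun on a failed tracker
            }
            return !otherTrackersFree;          // ordinary attempts fall back only when nothing else is free
        }
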
        Runping Qi added a comment -

        Ideally, it would be best if the task tracker could diagnose whether a failure is task-specific or a general problem with the node.
        If it is a general problem, the task tracker should raise an alert and stop polling for tasks until the problem is corrected.

        Here is a case I encountered. For some reason, the tmp dir of the DFS was not writable:

        060417 144314 task_r_3e1f4h Error running child
        060417 144314 task_r_3e1f4h java.io.FileNotFoundException: /export/crawlspace2/kryptonite/hadoop/dfs/data/tmp/client-5535743708351505322 (Read-only file system)
        060417 144314 task_r_3e1f4h     at java.io.FileOutputStream.open(Native Method)
        060417 144314 task_r_3e1f4h     at java.io.FileOutputStream.<init>(FileOutputStream.java:179)
        060417 144314 task_r_3e1f4h     at java.io.FileOutputStream.<init>(FileOutputStream.java:131)
        060417 144314 task_r_3e1f4h     at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:576)
        060417 144314 task_r_3e1f4h     at org.apache.hadoop.dfs.DFSClient.create(DFSClient.java:127)
        060417 144314 task_r_3e1f4h     at org.apache.hadoop.dfs.DistributedFileSystem.createRaw(DistributedFileSystem.java:83)
        060417 144314 task_r_3e1f4h     at org.apache.hadoop.fs.FSDataOutputStream$Summer.<init>(FSDataOutputStream.java:43)
        060417 144314 task_r_3e1f4h     at org.apache.hadoop.fs.FSDataOutputStream.<init>(FSDataOutputStream.java:132)
        060417 144314 task_r_3e1f4h     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:201)
        060417 144314 task_r_3e1f4h     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:168)
        060417 144314 task_r_3e1f4h     at com.yahoo.yst.crawl.aggregation.CrawledDocOutputFormat.getRecordWriter(CrawledDocOutputFormat.java:39)
        060417 144314 task_r_3e1f4h     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:265)
        060417 144314 task_r_3e1f4h     at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:709)

        In this case, all reduce tasks will fail, so it does not make sense to accept any new reduce tasks.

        In another case, if the map.local.dir is not writable, then both map tasks and reduce tasks will fail.
        In that case, the task tracker should stop polling for new tasks altogether.
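
        A small sketch of the kind of tracker-side check being suggested (hypothetical; the directory list and error reporting are assumed for illustration, this is not TaskTracker code):

        import java.io.File;

        // Hypothetical sketch: before polling the JobTracker for new work, probe the
        // local work directories; if one is not writable the problem is node-wide,
        // so report it and stop asking for tasks until it is fixed.
        static boolean healthyEnoughToPoll(File[] workDirs) {
            for (File dir : workDirs) {
                if (!dir.canWrite()) {
                    System.err.println("Directory " + dir + " is not writable; pausing task polling");
                    return false;
                }
            }
            return true;
        }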

        Owen O'Malley created issue -

          People

          • Assignee:
            Owen O'Malley
          • Reporter:
            Owen O'Malley
          • Votes:
            0
          • Watchers:
            0

            Dates

            • Created:
              Updated:
              Resolved:
