  Hadoop Common / HADOOP-400

the job tracker re-runs failed tasks on the same node


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.4.0
    • Fix Version/s: 0.6.0
    • Component/s: None
    • Labels: None

    Description

      The job tracker tries to avoid re-running a task on a node where it has previously failed, but it doesn't strictly prevent it.

      I propose to change the rule so that when pollForNewTask is called by a TaskTracker, the JobTracker will assign it a task that has already failed on that TaskTracker only if the task has failed on every TaskTracker in the cluster. Thus, for "normal" clusters with more than 4 TaskTrackers, a repeatedly failing task is guaranteed to run on 4 different TaskTrackers. For small clusters, it will run on every TaskTracker in the cluster at least once.

      Does that sound reasonable to everyone?
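
The proposed rule can be illustrated with a minimal Java sketch. This is not Hadoop's actual JobTracker code: the class `FailedTaskPlacement` and its methods `addTracker`, `recordFailure`, and `mayAssign` are hypothetical names introduced here, and TaskTrackers and tasks are represented by plain strings. It only shows the decision "assign a task to a tracker where it already failed only once it has failed on every tracker".

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the proposed assignment rule; not Hadoop's real code. */
class FailedTaskPlacement {
    // Task id -> names of the TaskTrackers on which that task has already failed.
    private final Map<String, Set<String>> failedOn = new HashMap<>();
    // Names of all TaskTrackers currently known to the cluster.
    private final Set<String> clusterTrackers = new HashSet<>();

    void addTracker(String tracker) {
        clusterTrackers.add(tracker);
    }

    void recordFailure(String taskId, String tracker) {
        failedOn.computeIfAbsent(taskId, k -> new HashSet<>()).add(tracker);
    }

    /** Returns true if the task may be handed to this tracker under the proposed rule. */
    boolean mayAssign(String taskId, String tracker) {
        Set<String> failures =
            failedOn.getOrDefault(taskId, Collections.<String>emptySet());
        if (!failures.contains(tracker)) {
            return true; // the task has never failed on this tracker
        }
        // The task already failed here: allow it only if it has failed everywhere.
        return failures.containsAll(clusterTrackers);
    }
}
```

The sketch deliberately leaves out the limit on attempts per task. In this model that limit would be a separate check applied before `mayAssign`, which is what bounds the number of distinct TaskTrackers tried on larger clusters.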


          People

            Assignee: Owen O'Malley (omalley)
            Reporter: Owen O'Malley (omalley)