Hadoop Map/Reduce / MAPREDUCE-1398

TaskLauncher remains stuck on tasks waiting for free nodes even if task is killed.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: tasktracker
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Fixed TaskLauncher to stop waiting for slots to become free on behalf of a TIP (task in progress) that is killed or fails while it is still in the queue.

      Description

      Tasks can be assigned to a tracker for slots that are still occupied by other tasks in the commit-pending state. This is an optimization that pipelines task assignment and launch. When such a task reaches the tracker, it waits until enough slots become free for it; the wait happens in the TaskLauncher thread. If the task is killed externally while it is waiting (for example, because the job finishes), the TaskLauncher is not notified, so it keeps waiting for the killed task to get enough slots. If slots do not become free for a long time, the TaskLauncher thread stays blocked for just as long. If the waiting task is a high RAM task, the wait is also wasteful: had the TaskLauncher woken up, it could have made way for normal tasks that fit in the slots already available.
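
      The behaviour described above can be illustrated with a small wait/notify sketch. This is not the code from the attached patches and does not use the TaskTracker's real classes: `SlotPool`, `PendingTask`, `awaitSlots`, `releaseSlots` and `kill` are hypothetical names chosen for illustration. The assumption is that the launcher waits on a monitor and that a kill has to both set a flag and notify that monitor, so the waiting thread re-checks its condition instead of sleeping until slots free up.

```java
/** Hypothetical slot pool; a sketch of the wait/notify pattern, not TaskTracker code. */
final class SlotPool {

    /** A task queued for launch. The killed flag is guarded by the pool's monitor. */
    static final class PendingTask {
        final int slotsNeeded;
        private boolean killed;

        PendingTask(int slotsNeeded) {
            this.slotsNeeded = slotsNeeded;
        }
    }

    private int freeSlots;

    SlotPool(int freeSlots) {
        this.freeSlots = freeSlots;
    }

    synchronized int freeSlots() {
        return freeSlots;
    }

    /**
     * Blocks until enough slots are free or the task is killed.
     * Returns true if the slots were claimed, false if the task was
     * killed (or the waiting thread was interrupted) first.
     */
    synchronized boolean awaitSlots(PendingTask task) {
        // The loop condition checks the kill flag as well as the slot count,
        // so a kill notification is enough to end the wait.
        while (!task.killed && freeSlots < task.slotsNeeded) {
            try {
                wait();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        if (task.killed) {
            return false; // give up without claiming any slots
        }
        freeSlots -= task.slotsNeeded;
        return true;
    }

    /** Returns slots to the pool and wakes any waiting launcher. */
    synchronized void releaseSlots(int n) {
        freeSlots += n;
        notifyAll();
    }

    /** Marks the task killed and wakes the launcher so it stops waiting. */
    synchronized void kill(PendingTask task) {
        task.killed = true;
        notifyAll();
    }

    public static void main(String[] args) throws InterruptedException {
        SlotPool pool = new SlotPool(1);          // one slot is free
        PendingTask highRam = new PendingTask(2); // needs two, so it must wait
        Thread launcher = new Thread(() -> {
            boolean launched = pool.awaitSlots(highRam);
            System.out.println("high RAM task launched: " + launched);
        });
        launcher.start();
        pool.kill(highRam);                       // external kill wakes the launcher
        launcher.join();
        // The launcher is free again, so a one-slot task can use the free slot.
        System.out.println("normal task launched: " + pool.awaitSlots(new PendingTask(1)));
    }
}
```

      In this sketch, dropping the `notifyAll()` call from `kill` reproduces the reported symptom: the launcher stays blocked until some unrelated `releaseSlots` call happens to wake it.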

      1. mr-1398-y20.patch
        11 kB
        Hemanth Yamijala
      2. patch-1398.txt
        11 kB
        Amareshwari Sriramadasu
      3. patch-1398-1.txt
        11 kB
        Amareshwari Sriramadasu
      4. patch-1398-2.txt
        12 kB
        Amareshwari Sriramadasu
      5. patch-1398-ydist.txt
        11 kB
        Amareshwari Sriramadasu

          People

          • Assignee:
            Amareshwari Sriramadasu
            Reporter:
            Hemanth Yamijala
          • Votes:
            0
            Watchers:
            2
