Hadoop Map/Reduce / MAPREDUCE-1398

TaskLauncher remains stuck on tasks waiting for free nodes even if task is killed.


Details

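    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: tasktracker
    • Labels: None
    • Hadoop Flags: Reviewed
    • Release Note: Fixed TaskLauncher to stop waiting for blocked slots on behalf of a TIP (TaskInProgress) that is killed or failed while it is in the queue.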

    Description

      Tasks can be assigned to trackers for slots that are still running other tasks in a commit-pending state. This optimization pipelines task assignment and launch. When such a task reaches the tracker, it waits in the TaskLauncher thread until enough slots become free. If the task is killed externally while it waits (for example, because the job finishes), the TaskLauncher is not notified, so it keeps waiting for slots on behalf of a task that is already dead. If slots do not become free for a long time, the TaskLauncher thread stays blocked for just as long. This is especially wasteful when the waiting task is a high-RAM task: if the TaskLauncher woke up and dropped it, normal tasks that fit in the currently available slots could be launched instead.
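
      The behaviour described above, and the kind of fix the release note points to, can be sketched roughly as follows. This is a minimal illustration only, not the actual TaskTracker code: the class and method names (SlotWaitSketch, PendingTask, waitForSlots, kill, releaseSlots) are hypothetical, and the attached patches may well be structured differently. The idea is that the wait loop checks a per-task killed flag, and that killing a task notifies the same monitor the launcher is waiting on.

```java
/** Hypothetical sketch of a launcher that stops waiting when its task is killed. */
class SlotWaitSketch {
    /** A queued task; 'killed' is only read or written while holding 'lock'. */
    static final class PendingTask {
        final int slotsNeeded;
        private boolean killed;

        PendingTask(int slotsNeeded) {
            this.slotsNeeded = slotsNeeded;
        }
    }

    private final Object lock = new Object();
    private int freeSlots;

    SlotWaitSketch(int freeSlots) {
        this.freeSlots = freeSlots;
    }

    /**
     * Blocks until enough slots are free or the task is killed.
     * Returns true if the slots were reserved, false if the launcher
     * should drop the task (killed, or the thread was interrupted).
     */
    boolean waitForSlots(PendingTask task) {
        synchronized (lock) {
            // Re-check BOTH conditions after every wake-up. Checking only
            // the slot count would reproduce the bug from the description.
            while (!task.killed && freeSlots < task.slotsNeeded) {
                try {
                    lock.wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve interrupt status
                    return false;
                }
            }
            if (task.killed) {
                return false; // give up so smaller tasks can use the slots
            }
            freeSlots -= task.slotsNeeded;
            return true;
        }
    }

    /** Marks the task as killed and wakes any launcher waiting on its behalf. */
    void kill(PendingTask task) {
        synchronized (lock) {
            task.killed = true;
            lock.notifyAll();
        }
    }

    /** Returns slots to the pool and wakes waiting launchers. */
    void releaseSlots(int count) {
        synchronized (lock) {
            freeSlots += count;
            lock.notifyAll();
        }
    }

    int freeSlots() {
        synchronized (lock) {
            return freeSlots;
        }
    }
}
```

      In this sketch, a launcher thread blocked in waitForSlots for a two-slot (high-RAM) task on a tracker with one free slot returns false as soon as kill is called for that task, leaving the free slot available for the next one-slot task in the queue.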

      Attachments

        1. mr-1398-y20.patch
          11 kB
          Hemanth Yamijala
        2. patch-1398.txt
          11 kB
          Amareshwari Sriramadasu
        3. patch-1398-1.txt
          11 kB
          Amareshwari Sriramadasu
        4. patch-1398-2.txt
          12 kB
          Amareshwari Sriramadasu
        5. patch-1398-ydist.txt
          11 kB
          Amareshwari Sriramadasu

          People

            Assignee: Amareshwari Sriramadasu (amareshwari)
            Reporter: Hemanth Yamijala (yhemanth)