Hadoop Map/Reduce / MAPREDUCE-1682

Tasks should not be scheduled after tip is killed/failed.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: jobtracker
    • Labels:
      None

      Description

      We have seen the following scenario in our cluster:
      A job was marked failed because four attempts of a TIP failed. This kills all the map and reduce TIPs, and then a job-cleanup attempt is launched.
      The job-cleanup attempt failed because it could not report status for 10 minutes. There were 3 such job-cleanup attempts, so the job was finally killed only after about half an hour.
      While waiting for the job cleanup to finish, the JobTracker kept scheduling many tasks of the job on TaskTrackers and then sent a KillTaskAction for them in the next heartbeat.

      This wastes a lot of resources. We should avoid scheduling tasks of a TIP once the TIP is killed/failed.
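
      The proposed behavior can be sketched as a guard in the scheduling loop. This is only an illustration under stated assumptions: SketchTip, SketchScheduler, and pickSchedulable are hypothetical names invented here, not Hadoop's actual JobTracker or TaskInProgress API, and the attached patch may implement the check differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for a task-in-progress (TIP).
// This is NOT Hadoop's TaskInProgress class.
class SketchTip {
    enum State { RUNNABLE, KILLED, FAILED }

    private final String id;
    private State state = State.RUNNABLE;

    SketchTip(String id) { this.id = id; }

    String getId() { return id; }
    void kill() { state = State.KILLED; }
    void fail() { state = State.FAILED; }

    // A TIP may receive new attempts only while it is neither killed nor failed.
    boolean isSchedulable() { return state == State.RUNNABLE; }
}

// Hypothetical scheduler helper showing where the guard would sit.
class SketchScheduler {
    // Returns the ids of TIPs that may get a new attempt, up to freeSlots.
    // Killed or failed TIPs are skipped, so no attempt is launched only to
    // be answered with a kill action on the next heartbeat.
    static List<String> pickSchedulable(List<SketchTip> tips, int freeSlots) {
        List<String> picked = new ArrayList<>();
        for (SketchTip tip : tips) {
            if (picked.size() >= freeSlots) {
                break;
            }
            if (!tip.isSchedulable()) {
                continue; // the guard this issue asks for
            }
            picked.add(tip.getId());
        }
        return picked;
    }
}
```

      With such a guard, a job whose TIPs were all killed while its cleanup attempt is still pending would hand out no new task attempts, even when TaskTrackers report free slots.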

        Activity

        Amareshwari Sriramadasu created issue
        Todd Lipcon made changes:
          Attachment: added mapreduce-1682-ydh.txt [ 12452467 ]
        Todd Lipcon made changes:
          Assignee: set to Arun C Murthy [ acmurthy ]
        Owen O'Malley made changes:
          Fix Version/s: set to 0.20.3 [ 12314813 ]
        Allen Wittenauer made changes:
          Status: Open [ 1 ] -> Resolved [ 5 ]
          Resolution: set to Fixed [ 1 ]

          People

          • Assignee: Arun C Murthy
          • Reporter: Amareshwari Sriramadasu
          • Votes: 0
          • Watchers: 5