Hadoop Common / HADOOP-3864

JobTracker lockup due to JobInProgress.initTasks taking significant time for large jobs on large clusters


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.18.0
    • Fix Version/s: 0.19.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      JobInProgress.initTasks takes a significant amount of time for large jobs on a large cluster (55k maps * 3 splits), and the JobInProgress object stays locked for that entire time.

      Meanwhile, the JobClient calls JobTracker.getTaskCompletionEvents, which locks the JobTracker and then tries to lock the JobInProgress. That call blocks while still holding the JobTracker lock, so it starves every heartbeat that is trying to lock the JobTracker, resulting in a lockup.
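
      The lock interaction described above can be reproduced in miniature. The following is only a sketch, not the actual JobTracker code: the class name LockupSketch, the method heartbeatStarved, and the two ReentrantLock objects are hypothetical stand-ins for whatever monitors the real JobTracker and JobInProgress use, and the latch stands in for the long-running work in initTasks.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical model of the reported lockup; none of these names come from Hadoop.
class LockupSketch {

    // Returns true if the simulated heartbeat could not get the tracker lock
    // while the simulated initTasks was still holding the job lock.
    static boolean heartbeatStarved() {
        ReentrantLock trackerLock = new ReentrantLock(); // stands in for the JobTracker lock
        ReentrantLock jobLock = new ReentrantLock();     // stands in for the JobInProgress lock
        CountDownLatch initHoldsJob = new CountDownLatch(1);
        CountDownLatch clientHoldsTracker = new CountDownLatch(1);
        CountDownLatch initFinished = new CountDownLatch(1);

        // Simulated initTasks: holds the job lock for a long time.
        Thread init = new Thread(() -> {
            jobLock.lock();
            try {
                initHoldsJob.countDown();
                initFinished.await(); // placeholder for the slow per-split work
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                jobLock.unlock();
            }
        });

        // Simulated getTaskCompletionEvents: takes the tracker lock, then waits for the job lock.
        Thread client = new Thread(() -> {
            try {
                initHoldsJob.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            trackerLock.lock();
            try {
                clientHoldsTracker.countDown();
                jobLock.lock(); // blocks here while still holding trackerLock
                jobLock.unlock();
            } finally {
                trackerLock.unlock();
            }
        });

        init.start();
        client.start();
        try {
            clientHoldsTracker.await();
            // Simulated heartbeat: needs the tracker lock, gives up after a short timeout.
            boolean acquired = trackerLock.tryLock(200, TimeUnit.MILLISECONDS);
            if (acquired) {
                trackerLock.unlock();
            }
            initFinished.countDown(); // let the simulated initTasks complete
            init.join();
            client.join();
            return !acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

      In this model the heartbeat never contends for the job lock directly; it is starved only because the client thread keeps the tracker lock while waiting on the job lock.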

        Attachments

        1. HADOOP-3864_0_20080830.patch
          4 kB
          Arun Murthy


            People

            • Assignee: Arun Murthy (acmurthy)
            • Reporter: Arun Murthy (acmurthy)
