Hadoop Common / HADOOP-3864

JobTracker lockup due to JobInProgress.initTasks taking significant time for large jobs on large clusters

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.18.0
    • Fix Version/s: 0.19.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      JobInProgress.initTasks takes a significant amount of time for large jobs on a large cluster (55k maps * 3 splits), and the JobInProgress object stays locked for the whole duration.

      Meanwhile, the JobClient calls JobTracker.getTaskCompletionEvents, which locks the JobTracker and then tries to lock the JobInProgress. While it waits, it starves all heartbeats that are trying to lock the JobTracker, resulting in a lockup.
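
      The lock chain described above can be reproduced in miniature. The following is only a sketch under stated assumptions: the two lock objects are hypothetical stand-ins for the JobTracker and JobInProgress monitors, the sleep stands in for the long-running initTasks, and none of the names below are Hadoop code. It measures how long a simulated heartbeat waits for the JobTracker lock while a client thread holds that lock and is itself blocked on the JobInProgress lock.

```java
import java.util.concurrent.CountDownLatch;

class LockConvoySketch {
    // Hypothetical stand-ins for the JobTracker and JobInProgress monitors.
    private static final Object JOB_TRACKER = new Object();
    private static final Object JOB_IN_PROGRESS = new Object();

    /** Returns how long (ms) a simulated heartbeat waited for the JobTracker lock. */
    static long heartbeatDelayMillis(long initMillis) {
        CountDownLatch jobLocked = new CountDownLatch(1);
        CountDownLatch trackerLocked = new CountDownLatch(1);

        // Plays the role of initTasks: holds the job lock for a long time.
        Thread initTasks = new Thread(() -> {
            synchronized (JOB_IN_PROGRESS) {
                jobLocked.countDown();
                sleep(initMillis); // simulated expensive initialization
            }
        });

        // Plays the role of getTaskCompletionEvents: takes the tracker lock,
        // then blocks on the job lock while still holding the tracker lock.
        Thread client = new Thread(() -> {
            await(jobLocked);
            synchronized (JOB_TRACKER) {
                trackerLocked.countDown();
                synchronized (JOB_IN_PROGRESS) {
                    // completion events would be read here
                }
            }
        });

        initTasks.start();
        client.start();
        await(trackerLocked);

        // Plays the role of a heartbeat: it only needs the tracker lock,
        // yet it is stuck until the initialization thread finishes.
        long start = System.nanoTime();
        synchronized (JOB_TRACKER) {
            // the heartbeat would be processed here
        }
        long waitedMillis = (System.nanoTime() - start) / 1_000_000;

        join(initTasks);
        join(client);
        return waitedMillis;
    }

    private static void sleep(long millis) {
        try { Thread.sleep(millis); }
        catch (InterruptedException e) { throw new IllegalStateException(e); }
    }

    private static void await(CountDownLatch latch) {
        try { latch.await(); }
        catch (InterruptedException e) { throw new IllegalStateException(e); }
    }

    private static void join(Thread t) {
        try { t.join(); }
        catch (InterruptedException e) { throw new IllegalStateException(e); }
    }

    public static void main(String[] args) {
        System.out.println("heartbeat waited ms: " + heartbeatDelayMillis(400));
    }
}
```

      In this sketch the heartbeat's wait grows with the time spent inside the job lock, even though the heartbeat never touches that lock itself. This matches the starvation pattern reported here; it does not reflect how the attached patch resolves it.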

      Attachments

        1. HADOOP-3864_0_20080830.patch
          4 kB
          Arun Murthy

          People

            Assignee: Arun Murthy (acmurthy)
            Reporter: Arun Murthy (acmurthy)
