Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.23.3, 2.0.1-alpha
    • Fix Version/s: 2.0.3-alpha, 0.23.6
    • Component/s: applicationmaster
    • Labels:
      None

      Description

      The AM calls the output committer's commitJob method synchronously during JobImpl state transitions, so the JobImpl write lock is held for the entire duration of the job commit. While the write lock is held, the RM allocator thread cannot heartbeat to the RM. If committing the job takes too long (e.g., the job has a very large number of files to commit and/or the namenode is overloaded), the AM appears unresponsive to the RM, and the RM kills the AM attempt.
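
The locking problem described above can be reproduced in miniature. The following is only a sketch under stated assumptions, not code from Hadoop or from the attached patches: `CommitLockSketch`, `slowCommit`, `heartbeat`, and `heartbeatDuringCommit` are hypothetical names, the heartbeat is assumed to need the job's read lock, and the "hand off" variant (running the commit on a separate thread so the transition can release the write lock) is one common remedy for this kind of stall, shown for contrast rather than as the fix that was actually committed.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class CommitLockSketch {
    // Hypothetical stand-in for the JobImpl read/write lock.
    private final ReentrantReadWriteLock jobLock = new ReentrantReadWriteLock();
    private final CountDownLatch commitStarted = new CountDownLatch(1);
    private final CountDownLatch commitMayFinish = new CountDownLatch(1);

    // Simulates a commit that blocks until the caller lets it finish.
    private void slowCommit() {
        commitStarted.countDown();
        try {
            commitMayFinish.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Assumption: the heartbeat path must take the read lock to inspect job state.
    boolean heartbeat(long timeoutMs) throws InterruptedException {
        if (!jobLock.readLock().tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
            return false; // job state was not readable in time
        }
        try {
            return true;
        } finally {
            jobLock.readLock().unlock();
        }
    }

    // Returns whether a heartbeat succeeds while the commit is still running.
    static boolean heartbeatDuringCommit(boolean commitInsideLock) throws InterruptedException {
        CommitLockSketch job = new CommitLockSketch();
        ExecutorService committer = Executors.newSingleThreadExecutor();
        Thread transition = new Thread(() -> {
            job.jobLock.writeLock().lock();
            try {
                if (commitInsideLock) {
                    job.slowCommit();                   // write lock held for the whole commit
                } else {
                    committer.execute(job::slowCommit); // hand off, then release the lock
                }
            } finally {
                job.jobLock.writeLock().unlock();
            }
        });
        transition.start();
        job.commitStarted.await();          // the commit is now in progress
        boolean ok = job.heartbeat(500);
        job.commitMayFinish.countDown();    // let the simulated commit complete
        transition.join();
        committer.shutdown();
        return ok;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("commit inside lock, heartbeat ok: " + heartbeatDuringCommit(true));
        System.out.println("commit handed off, heartbeat ok: " + heartbeatDuringCommit(false));
    }
}
```

With the commit inside the write lock the heartbeat times out; with the commit handed off, the heartbeat acquires the read lock while the commit is still running. The 500 ms timeout is arbitrary and stands in for the RM's liveness window.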

      Attachments

      1. MAPREDUCE-4813-2-branch-0.23.patch
        132 kB
        Jason Lowe
      2. MAPREDUCE-4813-2.patch
        138 kB
        Jason Lowe
      3. MAPREDUCE-4813-2.patch
        138 kB
        Jason Lowe
      4. MAPREDUCE-4813-2.patch
        138 kB
        Jason Lowe
      5. MAPREDUCE-4813-2.patch
        138 kB
        Jason Lowe
      6. MAPREDUCE-4813.patch
        13 kB
        Jason Lowe
      7. MAPREDUCE-4813.patch
        25 kB
        Jason Lowe
      8. MAPREDUCE-4813.patch
        31 kB
        Jason Lowe
      9. JobImplStateMachine.pdf
        42 kB
        Jason Lowe

            People

            • Assignee:
              Jason Lowe
              Reporter:
              Jason Lowe
            • Votes:
              0
              Watchers:
              11
