Hadoop Map/Reduce / MAPREDUCE-4831

Task commit can occur more than once due to AM retries

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Not A Problem
    • Affects Version/s: 0.23.0, 2.0.0-alpha
    • Fix Version/s: None
    • Component/s: mr-am
    • Labels: None

      Description

      If a task attempt begins committing but the AM crashes before the task attempt completes, the task can commit a second time when the AM is relaunched. The subsequent AM attempt does not see the task as completed, so it re-runs the task, and the new attempt commits again. The output committer is user code, and a task commit is not necessarily repeatable. We should therefore treat an AM crash during a task attempt commit the same way we treat a commit failure by the task attempt: the task should fail, since we do not know how to recover from a commit failure.

      This is similar to MAPREDUCE-4819: this issue concerns commit at the task level, whereas that one concerns commit at the job level.
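
The recovery rule proposed in the description can be illustrated with a small, standalone sketch. It is not Hadoop code and does not use any Hadoop API: the class `CommitRecoverySketch`, the `Recovery` enum, and the `.COMMIT_STARTED` / `.COMMIT_DONE` marker file names are all hypothetical, invented here to show the idea. The assumption is that the AM writes a marker before a task attempt starts committing and another once the commit finishes, so that a relaunched AM can tell an interrupted commit apart from a task that never reached the commit stage.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class CommitRecoverySketch {

    /** What a relaunched AM would do with a task attempt left by the previous AM attempt. */
    enum Recovery { RERUN_TASK, FAIL_TASK, TASK_COMPLETE }

    // Hypothetical marker file names, invented for this sketch.
    static Path startedMarker(Path dir, String attemptId) {
        return dir.resolve(attemptId + ".COMMIT_STARTED");
    }

    static Path doneMarker(Path dir, String attemptId) {
        return dir.resolve(attemptId + ".COMMIT_DONE");
    }

    /** Record that the attempt is about to commit, before the user's committer runs. */
    static void markCommitStarted(Path dir, String attemptId) {
        touch(startedMarker(dir, attemptId));
    }

    /** Record that the attempt's commit finished. */
    static void markCommitDone(Path dir, String attemptId) {
        touch(doneMarker(dir, attemptId));
    }

    private static void touch(Path marker) {
        try {
            Files.createDirectories(marker.getParent());
            Files.createFile(marker);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Decide how the relaunched AM handles the attempt, based only on the markers. */
    static Recovery decide(Path dir, String attemptId) {
        if (Files.exists(doneMarker(dir, attemptId))) {
            return Recovery.TASK_COMPLETE;   // commit finished before the crash
        }
        if (Files.exists(startedMarker(dir, attemptId))) {
            return Recovery.FAIL_TASK;       // commit was interrupted; do not commit again
        }
        return Recovery.RERUN_TASK;          // commit never started; re-running is safe
    }

    public static void main(String[] args) {
        Path dir = Paths.get(System.getProperty("java.io.tmpdir"),
                "commit-sketch-" + System.nanoTime());
        String attempt = "attempt_0";

        System.out.println(decide(dir, attempt)); // prints RERUN_TASK
        markCommitStarted(dir, attempt);          // simulate an AM crash right after this
        System.out.println(decide(dir, attempt)); // prints FAIL_TASK
        markCommitDone(dir, attempt);
        System.out.println(decide(dir, attempt)); // prints TASK_COMPLETE
    }
}
```

The point of the middle case is that the relaunched AM cannot know how far the user's committer got, so it fails the task rather than letting a second commit run.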

              People

              • Assignee: Unassigned
              • Reporter: Jason Darrell Lowe (jlowe)
              • Votes: 0
              • Watchers: 7
