  1. Hadoop Map/Reduce
  2. MAPREDUCE-4912

Investigate ways to clean up double job commit prevention

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: mrv2
    • Labels: None

    Description

      Once MAPREDUCE-4819 goes in, it will fix the issue where an OutputCommitter can commit a job twice, so that the output is never touched after the job has externally reported success or failure.
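
To make the idea of double-commit prevention concrete, here is a minimal sketch of a marker-file guard. It is not the MAPREDUCE-4819 code: it uses a local directory in place of HDFS, and the function name `commit_once`, the marker names, and the returned state strings are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path

# Marker names are illustrative assumptions, not taken from this issue.
STARTED, SUCCESS, FAIL = "COMMIT_STARTED", "COMMIT_SUCCESS", "COMMIT_FAIL"


def commit_once(staging: Path, do_commit) -> str:
    """Run do_commit at most once per staging directory.

    Returns "SUCCEEDED", "FAILED", or "ERROR". "ERROR" means an earlier
    attempt began committing but recorded no outcome, so the state of the
    output is unknown and it must not be touched again.
    """
    if (staging / SUCCESS).exists():
        return "SUCCEEDED"  # outcome already recorded; do not re-commit
    if (staging / FAIL).exists():
        return "FAILED"
    try:
        # Mode "x" creates the file exclusively and raises if it exists,
        # so only one attempt can ever pass this point.
        (staging / STARTED).open("x").close()
    except FileExistsError:
        return "ERROR"
    try:
        do_commit()
    except Exception:
        (staging / FAIL).touch()
        return "FAILED"
    (staging / SUCCESS).touch()
    return "SUCCEEDED"
```

A second call against the same directory returns the recorded outcome without invoking `do_commit` again, which is the property the description asks for.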

      The code and design could potentially use some cleanup and refactoring.

      Issues brought up that should be investigated include:

      1. Reporting KILL instead of error for killed jobs that crash after the kill happens.
      2. Using the job history log to record the commit status instead of separate external files in HDFS.
      3. Placing the recovery/retry logic in the commit handler instead of the MRAppMaster, and having the recovery service replay the logs as it normally does for recovery.

      These are not things that must be done, but alternatives that might clean up the code.
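
As a rough illustration of how alternatives 1 and 2 could fit together, the sketch below replays a list of logged events from a crashed attempt and derives the final state to report. The event names, the function `recover_final_state`, and the state strings are hypothetical; they are not actual job history event types, and the precedence rules are choices made for this sketch only.

```python
from typing import Iterable, Optional


def recover_final_state(events: Iterable[str]) -> Optional[str]:
    """Replay logged events from a crashed attempt, in order.

    Returns the final job state to report, or None when nothing was
    recorded that prevents the job from simply being rerun.
    """
    killed = False
    commit_started = False
    for event in events:
        if event == "KILL_REQUESTED":
            killed = True
        elif event == "COMMIT_STARTED":
            commit_started = True
        elif event == "COMMIT_SUCCEEDED":
            return "SUCCEEDED"  # a recorded outcome is final
        elif event == "COMMIT_FAILED":
            return "FAILED"
    if killed:
        # Alternative 1: a crash after a kill is still reported as a kill.
        return "KILLED"
    if commit_started:
        # Commit began but no outcome was logged: output state is unknown.
        return "ERROR"
    return None
```

For example, `recover_final_state(["KILL_REQUESTED"])` yields `"KILLED"` rather than an error, while a log containing only `"COMMIT_STARTED"` yields `"ERROR"`.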

          People

            Assignee: Unassigned
            Reporter: Robert Joseph Evans (revans2)
            Votes: 0
            Watchers: 7