  Hadoop Map/Reduce / MAPREDUCE-6336

Enable v2 FileOutputCommitter by default


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.7.0
    • Fix Version/s: 3.0.0-alpha1
    • Component/s: mrv2
    • Hadoop Flags: Incompatible change, Reviewed
    • Release Note:
      mapreduce.fileoutputcommitter.algorithm.version now defaults to 2.
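      If downstream tooling still depends on the version 1 semantics described below, the algorithm can be pinned explicitly. A sketch of a mapred-site.xml fragment (property name as given above; the description text is illustrative):

```xml
<property>
  <name>mapreduce.fileoutputcommitter.algorithm.version</name>
  <value>1</value>
  <description>Pin the FileOutputCommitter to the version 1 algorithm.</description>
</property>
```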
        
      In algorithm version 1:

        1. commitTask renames directory
        $joboutput/_temporary/$appAttemptID/_temporary/$taskAttemptID/
        to
        $joboutput/_temporary/$appAttemptID/$taskID/

        2. recoverTask renames
        $joboutput/_temporary/$appAttemptID/$taskID/
        to
        $joboutput/_temporary/($appAttemptID + 1)/$taskID/

        3. commitJob merges every task output file in
        $joboutput/_temporary/$appAttemptID/$taskID/
        to
        $joboutput/, then it will delete $joboutput/_temporary/
        and write $joboutput/_SUCCESS
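      The version 1 rename chain can be illustrated with a small local-filesystem simulation (local paths stand in for HDFS; the attempt/task IDs and part-file name are illustrative, and the commitJob merge is flattened to one directory level):

```python
import shutil
import tempfile
from pathlib import Path

# Illustrative stand-ins for $joboutput, $appAttemptID, $taskAttemptID, $taskID.
joboutput = Path(tempfile.mkdtemp()) / "joboutput"
app_attempt = "1"
task_attempt = "attempt_001_m_000000_0"
task = "task_001_m_000000"

# The task attempt writes output under its private temporary directory.
attempt_dir = joboutput / "_temporary" / app_attempt / "_temporary" / task_attempt
attempt_dir.mkdir(parents=True)
(attempt_dir / "part-m-00000").write_text("data")

# Step 1, commitTask: one rename of the attempt directory to the task directory.
task_dir = joboutput / "_temporary" / app_attempt / task
attempt_dir.rename(task_dir)

# Step 3, commitJob: move every committed task's files into $joboutput/
# (one rename per output file, hence O(n)), delete _temporary, write _SUCCESS.
for committed_task in (joboutput / "_temporary" / app_attempt).iterdir():
    if committed_task.name == "_temporary":
        continue  # skip the (now empty) attempts scratch directory
    for f in committed_task.iterdir():
        f.rename(joboutput / f.name)
shutil.rmtree(joboutput / "_temporary")
(joboutput / "_SUCCESS").touch()

print(sorted(p.name for p in joboutput.iterdir()))  # ['_SUCCESS', 'part-m-00000']
```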

      commitJob's run time (dominated by the number of rename RPCs) is O(n) in the number of output files, as discussed in MAPREDUCE-4815, and can take minutes.

      Algorithm version 2 changes the behavior of commitTask, recoverTask, and commitJob.

        1. commitTask renames all files in
        $joboutput/_temporary/$appAttemptID/_temporary/$taskAttemptID/
        to $joboutput/

        2. recoverTask is, strictly speaking, a no-op; however, to support
        upgrading from version 1 to version 2, it checks whether there
        are any files in
        $joboutput/_temporary/($appAttemptID - 1)/$taskID/
        and renames them to $joboutput/

        3. commitJob deletes $joboutput/_temporary and writes
        $joboutput/_SUCCESS
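      Under the same illustrative local-filesystem simulation, version 2's commitTask moves files straight into $joboutput/, leaving commitJob with only cleanup and the marker write:

```python
import shutil
import tempfile
from pathlib import Path

# Same illustrative stand-ins as in the version 1 sketch above.
joboutput = Path(tempfile.mkdtemp()) / "joboutput"
app_attempt = "1"
task_attempt = "attempt_001_m_000000_0"

attempt_dir = joboutput / "_temporary" / app_attempt / "_temporary" / task_attempt
attempt_dir.mkdir(parents=True)
(attempt_dir / "part-m-00000").write_text("data")

# Step 1, commitTask: rename each file straight into $joboutput/.
# Output becomes visible as each task commits, and tasks commit in parallel.
for f in attempt_dir.iterdir():
    f.rename(joboutput / f.name)

# Step 3, commitJob: no per-file work left -- delete _temporary, write _SUCCESS.
shutil.rmtree(joboutput / "_temporary")
(joboutput / "_SUCCESS").touch()

print(sorted(p.name for p in joboutput.iterdir()))  # ['_SUCCESS', 'part-m-00000']
```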

      Algorithm 2 takes advantage of task parallelism and makes commitJob itself O(1). However, the window of vulnerability during which the $jobOutput directory may contain incomplete output is much larger. Therefore, pipeline logic that consumes job outputs should check for the existence of the _SUCCESS marker.
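      A minimal sketch of such a consumer-side check (the function name is illustrative):

```python
from pathlib import Path

def job_output_ready(joboutput: Path) -> bool:
    """Treat a job's output as consumable only once commitJob has written
    _SUCCESS; under algorithm 2, files may appear in joboutput earlier."""
    return (joboutput / "_SUCCESS").is_file()
```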

    Description

      This JIRA proposes making the new FileOutputCommitter behavior from MAPREDUCE-4815 the default in trunk, and potentially in branch-2.

Attachments

    1. MAPREDUCE-6336.v1.patch
       0.8 kB
       Siqi Li


People

    Assignee: Siqi Li (l201514)
    Reporter: Gera Shegalov (jira.shegalov)
    Votes: 0
    Watchers: 10
