  1. Hadoop Map/Reduce
  2. MAPREDUCE-6823

FileOutputFormat to support configurable PathOutputCommitter factory

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 3.0.0-alpha2
    • Fix Version/s: None
    • Component/s: mrv2
    • Labels:
      None
    • Environment:

      Targeting S3 as the output of work

    • Target Version/s:

      Description

      In HADOOP-13786 I'm adding a custom subclass for FileOutputFormat, one which can talk directly to the S3A filesystem for more efficient operations, better failure modes and, most critically, as part of HADOOP-13345, atomic commit of output. The normal committer relies on directory rename() being atomic for this; for S3 we don't have that luxury.

      To support a custom committer, we need to be able to tell FileOutputFormat (and, implicitly, all subclasses which don't have their own custom committer) to use our new S3AOutputCommitter.

      I propose:

      1. FileOutputFormat takes a factory to create committers.
      2. The factory takes a URI and a TaskAttemptContext and returns a committer.
      3. The default implementation always returns a FileOutputCommitter.
      4. A configuration option allows a new factory to be named.
      5. An S3AOutputCommitterFactory returns either a FileOutputCommitter or the new S3AOutputCommitter, depending upon the URI of the destination.
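
The five points above can be sketched roughly as follows. This is only an illustration of the shape of the proposal, not the code in the attached patches: the Committer and CommitterFactory interfaces, the Map standing in for TaskAttemptContext and the job configuration, and the key "example.committer.factory.class" are all hypothetical stand-ins so that the example compiles without Hadoop on the classpath.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

public class CommitterFactorySketch {

    /** Hypothetical stand-in for a committer; not Hadoop's OutputCommitter. */
    interface Committer {
        String name();
    }

    /** Point 2: takes a destination URI and a task context, returns a committer. */
    interface CommitterFactory {
        // The Map is a stand-in for TaskAttemptContext (assumption for this sketch).
        Committer create(URI destination, Map<String, String> taskContext);
    }

    /** Point 3: the default factory always returns the file committer. */
    static final class FileCommitterFactory implements CommitterFactory {
        @Override
        public Committer create(URI destination, Map<String, String> taskContext) {
            return () -> "FileOutputCommitter";
        }
    }

    /** Point 5: choose the committer from the scheme of the destination URI. */
    static final class S3ACommitterFactory implements CommitterFactory {
        private final CommitterFactory fallback = new FileCommitterFactory();

        @Override
        public Committer create(URI destination, Map<String, String> taskContext) {
            if ("s3a".equalsIgnoreCase(destination.getScheme())) {
                return () -> "S3AOutputCommitter";
            }
            // Any other filesystem keeps the classic committer.
            return fallback.create(destination, taskContext);
        }
    }

    /** Hypothetical configuration key naming the factory class (point 4). */
    static final String FACTORY_KEY = "example.committer.factory.class";

    /** Points 1 and 4: the output format asks the configuration which factory to use. */
    static CommitterFactory loadFactory(Map<String, String> conf) {
        String className = conf.get(FACTORY_KEY);
        if (className == null) {
            return new FileCommitterFactory();
        }
        try {
            return Class.forName(className)
                    .asSubclass(CommitterFactory.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create factory " + className, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(FACTORY_KEY, S3ACommitterFactory.class.getName());
        CommitterFactory factory = loadFactory(conf);
        for (String dest : new String[] {"s3a://bucket/out", "hdfs://namenode/out"}) {
            System.out.println(dest + " -> " + factory.create(URI.create(dest), conf).name());
        }
    }
}
```

Keeping the decision inside the factory means FileOutputFormat and its subclasses need no knowledge of S3A; a job that never sets the option keeps today's behaviour.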

      Note that MRv1 already supports configurable committers; this change is only needed for the V2 API.

        Attachments

        1. HADOOP-13786-HADOOP-13345-001.patch
          126 kB
          Steve Loughran
        2. MAPREDUCE-6823-002.patch
          43 kB
          Steve Loughran
        3. MAPREDUCE-6823-002.patch
          43 kB
          Steve Loughran
        4. MAPREDUCE-6823-004.patch
          46 kB
          Steve Loughran

              People

              • Assignee:
                stevel@apache.org Steve Loughran
                Reporter:
                stevel@apache.org Steve Loughran
              • Votes:
                0
                Watchers:
                6
