  1. Hadoop Common
  2. HADOOP-15087

S3A to support writing directly to the destination dir without creating temp directory to avoid rename

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 3.0.0
    • Fix Version/s: None
    • Component/s: fs/s3
    • Labels: None

    Description

      In workloads such as Teragen and Terasort, which use Hadoop's default output committers, the rename step hurts performance badly.
      Stocator states that it does not create temporary directories at all, yet still preserves Hadoop's fault tolerance. I integrated its code into S3A behind a switch that is applied when a file is created, and got a 5x performance gain in Teragen and a 15% performance improvement in Terasort.
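
To illustrate why the rename step is costly on an object store, here is a minimal toy model. It is not S3A or Stocator code: `ToyObjectStore`, `commit_via_rename`, and `write_direct` are hypothetical names, and the temporary path is a simplified stand-in for the layout used by the default committers. The one assumption it encodes is that the store has no native rename, so a rename is emulated as a copy followed by a delete.

```python
# Toy in-memory model of an object store; not S3A or Stocator code.
class ToyObjectStore:
    def __init__(self):
        self.objects = {}       # key -> bytes
        self.bytes_written = 0  # total bytes the store had to write

    def put(self, key, data):
        self.objects[key] = data
        self.bytes_written += len(data)

    def rename(self, src, dst):
        # No native rename: emulate it as copy + delete.
        self.put(dst, self.objects[src])
        del self.objects[src]


def commit_via_rename(store, out_dir, parts):
    """Write each part under a temporary prefix, then rename it into place."""
    for name, data in parts.items():
        tmp_key = f"{out_dir}/_temporary/0/{name}"  # simplified, hypothetical layout
        store.put(tmp_key, data)
        store.rename(tmp_key, f"{out_dir}/{name}")


def write_direct(store, out_dir, parts):
    """Write each part straight to its final key, with no rename step."""
    for name, data in parts.items():
        store.put(f"{out_dir}/{name}", data)


parts = {"part-00000": b"x" * 1000, "part-00001": b"y" * 2000}

renamed, direct = ToyObjectStore(), ToyObjectStore()
commit_via_rename(renamed, "out", parts)
write_direct(direct, "out", parts)

# Both stores end with the same final objects, but the rename path
# writes every byte twice.
print(renamed.bytes_written, direct.bytes_written)
```

The model only counts bytes written. It says nothing about failure handling, which is the part a direct-write approach has to solve separately to keep Hadoop's fault tolerance.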

            People

              Assignee: Unassigned
              Reporter: Yonger (iyonger)
              Votes: 0
              Watchers: 3
