  1. Hadoop Common
  2. HADOOP-10610

Upgrade S3n fs.s3.buffer.dir to support multi directories

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.4.0
    • Fix Version/s: 2.6.0
    • Component/s: fs/s3
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      fs.s3.buffer.dir defines the temporary folder where files are written before being sent to S3. Right now this is limited to a single folder, which causes two major issues:

      1. You need a drive with enough space to store all the tmp files at once
      2. You are limited to the IO speeds of a single drive

      Supporting multiple directories resolves both issues and has been tested to increase S3 write speed by 2.5x with 10 mappers on hs1.
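
      To illustrate the idea, here is a minimal Java sketch of spreading temporary files across several buffer directories. It is not the attached patch: the class name `BufferDirRotator`, the round-robin policy, and the free-space check are assumptions made for illustration, and it only assumes that fs.s3.buffer.dir would hold a comma-separated list of directories. Hadoop ships its own `LocalDirAllocator` class for this kind of allocation; whether the patches use it is not stated in this description.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper (not Hadoop code): rotates temp files over several directories. */
public class BufferDirRotator {
    private final List<File> dirs = new ArrayList<>();
    private final AtomicInteger next = new AtomicInteger();

    /** @param commaSeparatedDirs assumed format of the property value, e.g. "/mnt/d1/tmp,/mnt/d2/tmp" */
    public BufferDirRotator(String commaSeparatedDirs) {
        for (String d : commaSeparatedDirs.split(",")) {
            String trimmed = d.trim();
            if (!trimmed.isEmpty()) {
                dirs.add(new File(trimmed));
            }
        }
        if (dirs.isEmpty()) {
            throw new IllegalArgumentException("no buffer directories configured");
        }
    }

    /**
     * Creates a temp file in the next usable directory, starting from a
     * round-robin position and skipping directories that are missing,
     * read-only, or have less usable space than sizeHint bytes.
     */
    public File createTmpFile(long sizeHint) throws IOException {
        int n = dirs.size();
        int start = Math.floorMod(next.getAndIncrement(), n);
        for (int i = 0; i < n; i++) {
            File dir = dirs.get((start + i) % n);
            boolean exists = dir.isDirectory() || dir.mkdirs();
            if (exists && dir.canWrite() && dir.getUsableSpace() >= sizeHint) {
                return File.createTempFile("output-", ".tmp", dir);
            }
        }
        throw new IOException("no buffer directory with enough usable space");
    }
}
```

      With one directory per physical drive, consecutive calls to `createTmpFile` land on different drives, so neither the capacity nor the IO throughput of a single drive bounds all concurrent writers.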

        Attachments

        1. HDFS-6383.patch
          1 kB
          Theodore michael Malaska
        2. HADOOP-10610.patch
          2 kB
          Theodore michael Malaska
        3. HADOOP_10610-2.patch
          2 kB
          Theodore michael Malaska

              People

              • Assignee:
                ted.m Theodore michael Malaska
                Reporter:
                ted.m Theodore michael Malaska
              • Votes:
                1
                Watchers:
                11

                Dates

                • Created:
                  Updated:
                  Resolved: