  1. Hadoop Common
  2. HADOOP-10610

Upgrade S3n fs.s3.buffer.dir to support multi directories

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.4.0
    • Fix Version/s: 2.6.0
    • Component/s: fs/s3
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      fs.s3.buffer.dir defines the temporary folder where files are written before being sent to S3. Right now this is limited to a single folder, which causes two major issues:

      1. You need a drive with enough space to store all the tmp files at once
      2. You are limited to the IO speeds of a single drive

      Allowing fs.s3.buffer.dir to list multiple directories resolves both issues. In testing, it increased S3 write speed by 2.5x with 10 mappers on hs1.
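
      To illustrate the idea, here is a minimal sketch of spreading buffer files across several directories given as a comma-separated value. It is not the attached patch: the class BufferDirPicker, its methods, and the round-robin policy are assumptions made for illustration only. Hadoop ships its own LocalDirAllocator class for this kind of selection, which the sketch deliberately does not use so that it runs on the standard library alone.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper (not part of Hadoop): rotates temp files across buffer directories. */
public class BufferDirPicker {
    private final List<File> dirs = new ArrayList<>();
    private final AtomicInteger next = new AtomicInteger();

    /** @param commaSeparatedDirs a value such as one might set for fs.s3.buffer.dir */
    public BufferDirPicker(String commaSeparatedDirs) {
        for (String d : commaSeparatedDirs.split(",")) {
            String trimmed = d.trim();
            if (!trimmed.isEmpty()) {
                dirs.add(new File(trimmed));
            }
        }
        if (dirs.isEmpty()) {
            throw new IllegalArgumentException("no buffer directories configured");
        }
    }

    /** Returns the next directory, in round-robin order, with at least minFreeBytes usable. */
    public File pick(long minFreeBytes) throws IOException {
        // Try each directory at most once, starting from the rotating cursor.
        for (int i = 0; i < dirs.size(); i++) {
            File dir = dirs.get(Math.floorMod(next.getAndIncrement(), dirs.size()));
            if ((dir.isDirectory() || dir.mkdirs()) && dir.getUsableSpace() >= minFreeBytes) {
                return dir;
            }
        }
        throw new IOException("no buffer directory has " + minFreeBytes + " bytes free");
    }

    /** Creates a temporary file in the next suitable directory. */
    public File createTmpFile(long minFreeBytes) throws IOException {
        return File.createTempFile("output-", ".tmp", pick(minFreeBytes));
    }
}
```

      With one directory per physical drive, consecutive writers land on different drives, so no single drive has to hold every temporary file or absorb all of the write I/O.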

      Attachments

        1. HADOOP_10610-2.patch
          2 kB
          Theodore michael Malaska
        2. HADOOP-10610.patch
          2 kB
          Theodore michael Malaska
        3. HDFS-6383.patch
          1 kB
          Theodore michael Malaska

            People

              Assignee: ted.m Theodore michael Malaska
              Reporter: ted.m Theodore michael Malaska
              Votes: 1
              Watchers: 11
