  1. Hadoop Common
  2. HADOOP-11446

S3AOutputStream should use shared thread pool to avoid OutOfMemoryError

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.6.0
    • Fix Version/s: 2.7.0
    • Component/s: fs/s3
    • Labels:
      None
    • Target Version/s:
    • Release Note:
      The following parameters are introduced in this JIRA:
      fs.s3a.threads.max: the maximum number of threads to allow in the pool used by TransferManager
      fs.s3a.threads.core: the number of threads to keep in the pool used by TransferManager
      fs.s3a.threads.keepalivetime: when the number of threads is greater than the core, this is the maximum time that excess idle threads will wait for new tasks before terminating
      fs.s3a.max.total.tasks: the maximum number of tasks that the LinkedBlockingQueue can hold
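
      As a rough illustration of how the four settings fit together, the sketch below maps them onto a java.util.concurrent.ThreadPoolExecutor backed by a bounded LinkedBlockingQueue. The class name, the method name, the sample values, and the choice of seconds as the keep-alive unit are assumptions made for this example, not the contents of the committed patch.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the Hadoop code base.
final class S3AThreadPoolSketch {

    // Each argument corresponds to one of the parameters listed above.
    static ThreadPoolExecutor newPool(int coreThreads,       // fs.s3a.threads.core
                                      int maxThreads,        // fs.s3a.threads.max
                                      long keepAliveSeconds, // fs.s3a.threads.keepalivetime (unit assumed)
                                      int maxTotalTasks) {   // fs.s3a.max.total.tasks
        return new ThreadPoolExecutor(
            coreThreads,
            maxThreads,
            keepAliveSeconds, TimeUnit.SECONDS,
            // Bounded queue: it holds at most maxTotalTasks waiting tasks.
            new LinkedBlockingQueue<Runnable>(maxTotalTasks));
    }

    public static void main(String[] args) {
        // Sample values chosen only for illustration.
        ThreadPoolExecutor pool = newPool(4, 16, 60L, 100);
        System.out.println("core=" + pool.getCorePoolSize()
            + " max=" + pool.getMaximumPoolSize()
            + " queueCapacity=" + pool.getQueue().remainingCapacity());
        pool.shutdown();
    }
}
```

      Note that with a bounded queue, ThreadPoolExecutor only grows beyond the core size once the queue is full, so the queue capacity and the maximum thread count interact.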

      Description

      The following issue was uncovered while working with Terry Padgett, who used s3a as the target for HBase snapshot export.
      Here is part of the output, including the OOME, from exporting an HBase snapshot to s3a (the nofile ulimit had been increased to 102400):

      2014-12-19 13:15:03,895 INFO  [main] s3a.S3AFileSystem: OutputStream for key 'FastQueryPOC/2014-12-11/EVENT1-IDX-snapshot/.hbase-snapshot/.tmp/EVENT1_IDX_snapshot_2012_12_11/    650a5678810fbdaa91809668d11ccf09/.regioninfo' closed. Now beginning upload
      2014-12-19 13:15:03,895 INFO  [main] s3a.S3AFileSystem: Minimum upload part size: 16777216 threshold2147483647
      Exception in thread "main" java.lang.OutOfMemoryError: unable to create new native thread
              at java.lang.Thread.start0(Native Method)
              at java.lang.Thread.start(Thread.java:713)
              at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:949)
              at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1360)
              at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:132)
              at com.amazonaws.services.s3.transfer.internal.UploadMonitor.<init>(UploadMonitor.java:129)
              at com.amazonaws.services.s3.transfer.TransferManager.upload(TransferManager.java:449)
              at com.amazonaws.services.s3.transfer.TransferManager.upload(TransferManager.java:382)
              at org.apache.hadoop.fs.s3a.S3AOutputStream.close(S3AOutputStream.java:127)
              at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
              at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
              at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:54)
              at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:112)
              at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:366)
              at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:356)
              at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:356)
              at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:338)
              at org.apache.hadoop.hbase.snapshot.ExportSnapshot.run(ExportSnapshot.java:791)
              at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
              at org.apache.hadoop.hbase.snapshot.ExportSnapshot.innerMain(ExportSnapshot.java:882)
              at org.apache.hadoop.hbase.snapshot.ExportSnapshot.main(ExportSnapshot.java:886)
      

      In S3AOutputStream#close():

            TransferManager transfers = new TransferManager(client);
      

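      Because a new TransferManager is constructed on every close(), each instance creates its own thread pool. When many files are written, the number of native threads grows until the JVM can no longer create one, leading to the OOME.
      One solution is to pass a shared thread pool to TransferManager.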
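
      A minimal sketch of the shared-pool idea follows. It uses only java.util.concurrent, with a hypothetical FakeOutputStream standing in for S3AOutputStream and an empty task standing in for the upload; none of these names come from the patch. In the real code the equivalent step would presumably be handing the pool to a TransferManager constructor that accepts an ExecutorService, which is an assumption about the AWS SDK rather than something shown in this issue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of Hadoop or the AWS SDK.
final class SharedUploadPoolSketch {

    // Stand-in for a stream whose close() starts an upload.
    static final class FakeOutputStream {
        private final ExecutorService uploads;

        FakeOutputStream(ExecutorService uploads) {
            // The pool is injected instead of being created per stream.
            this.uploads = uploads;
        }

        Future<?> close() {
            return uploads.submit(() -> {
                // Pretend upload: no real I/O in this sketch.
            });
        }
    }

    // Closes streamCount streams against one shared pool and returns
    // the largest number of threads the pool ever held.
    static int closeManyStreams(int streamCount, int maxThreads) throws Exception {
        ThreadPoolExecutor shared = new ThreadPoolExecutor(
            maxThreads, maxThreads, 60L, TimeUnit.SECONDS,
            new LinkedBlockingQueue<Runnable>());
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int i = 0; i < streamCount; i++) {
                pending.add(new FakeOutputStream(shared).close());
            }
            for (Future<?> upload : pending) {
                upload.get(); // wait for every pretend upload to finish
            }
            return shared.getLargestPoolSize();
        } finally {
            shared.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // The thread count stays bounded by maxThreads however many streams are closed.
        System.out.println("largest pool size: " + closeManyStreams(1000, 8));
    }
}
```

      The point of the design is that the thread count is capped by the pool configuration and no longer scales with the number of files closed.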

        Attachments

        1. hadoop-11446.addendum
          2 kB
          Ted Yu
        2. hadoop-11446-001.patch
          7 kB
          Ted Yu
        3. hadoop-11446-002.patch
          9 kB
          Ted Yu
        4. hadoop-11446-003.patch
          10 kB
          Ted Yu


              People

              • Assignee:
                yuzhihong@gmail.com Ted Yu
                Reporter:
                yuzhihong@gmail.com Ted Yu
              • Votes:
                0
                Watchers:
                9