[HADOOP-11446] S3AOutputStream should use shared thread pool to avoid OutOfMemoryError - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 2.6.0
Fix Version/s: 2.7.0
Component/s: fs/s3
Labels: None
Target Version/s: 2.7.0
Release Note:

The following parameters are introduced in this JIRA:
fs.s3a.threads.max: the maximum number of threads to allow in the pool used by TransferManager
fs.s3a.threads.core: the number of threads to keep in the pool used by TransferManager
fs.s3a.threads.keepalivetime: when the number of threads is greater than the core, this is the maximum time that excess idle threads will wait for new tasks before terminating
fs.s3a.max.total.tasks: the maximum number of tasks that the LinkedBlockingQueue can hold
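
As a rough illustration of how these four settings relate to one another, the sketch below maps them onto the arguments of a plain `java.util.concurrent.ThreadPoolExecutor`. This is not the code from the patch: the class name, the numeric values, and the choice of seconds as the keep-alive unit are all assumptions made for the example.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class SharedPoolSketch {
    // Illustrative values only; they are NOT the defaults shipped with the patch.
    static final int CORE_THREADS = 4;          // fs.s3a.threads.core
    static final int MAX_THREADS = 8;           // fs.s3a.threads.max
    static final long KEEP_ALIVE_SECONDS = 60;  // fs.s3a.threads.keepalivetime (unit assumed)
    static final int MAX_TOTAL_TASKS = 100;     // fs.s3a.max.total.tasks

    // Builds a pool whose worker count and queue length are both bounded.
    static ThreadPoolExecutor newPool() {
        return new ThreadPoolExecutor(
            CORE_THREADS,
            MAX_THREADS,
            KEEP_ALIVE_SECONDS, TimeUnit.SECONDS,
            new LinkedBlockingQueue<Runnable>(MAX_TOTAL_TASKS));
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = newPool();
        System.out.println("core=" + pool.getCorePoolSize()
            + " max=" + pool.getMaximumPoolSize()
            + " queueCapacity=" + pool.getQueue().remainingCapacity());
        pool.shutdown();
    }
}
```

One point worth keeping in mind with this kind of pool: a `ThreadPoolExecutor` only starts threads beyond the core size once its queue is full, so the queue capacity also affects how soon the maximum thread count comes into play.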


Description

While working with Terry Padgett, who was using s3a for HBase snapshots, we uncovered the following issue.
Below is part of the output, including the OOME, produced when an HBase snapshot is exported to s3a (the nofile ulimit had been increased to 102400):

2014-12-19 13:15:03,895 INFO  [main] s3a.S3AFileSystem: OutputStream for key 'FastQueryPOC/2014-12-11/EVENT1-IDX-snapshot/.hbase-snapshot/.tmp/EVENT1_IDX_snapshot_2012_12_11/    650a5678810fbdaa91809668d11ccf09/.regioninfo' closed. Now beginning upload
2014-12-19 13:15:03,895 INFO  [main] s3a.S3AFileSystem: Minimum upload part size: 16777216 threshold2147483647
Exception in thread "main" java.lang.OutOfMemoryError: unable to create new native thread
        at java.lang.Thread.start0(Native Method)
        at java.lang.Thread.start(Thread.java:713)
        at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:949)
        at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1360)
        at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:132)
        at com.amazonaws.services.s3.transfer.internal.UploadMonitor.<init>(UploadMonitor.java:129)
        at com.amazonaws.services.s3.transfer.TransferManager.upload(TransferManager.java:449)
        at com.amazonaws.services.s3.transfer.TransferManager.upload(TransferManager.java:382)
        at org.apache.hadoop.fs.s3a.S3AOutputStream.close(S3AOutputStream.java:127)
        at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
        at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:54)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:112)
        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:366)
        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:356)
        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:356)
        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:338)
        at org.apache.hadoop.hbase.snapshot.ExportSnapshot.run(ExportSnapshot.java:791)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.hbase.snapshot.ExportSnapshot.innerMain(ExportSnapshot.java:882)
        at org.apache.hadoop.hbase.snapshot.ExportSnapshot.main(ExportSnapshot.java:886)

In S3AOutputStream#close():

      TransferManager transfers = new TransferManager(client);

Because a new TransferManager is constructed on every close(), each instance creates its own thread pool, and the accumulated threads eventually lead to the OOME.
One solution is to pass a shared thread pool to TransferManager.
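
The idea can be sketched with JDK classes alone. In the example below, `Uploader` is a hypothetical stand-in for TransferManager (it is not the AWS SDK class), and the pool size and key names are made up. The point is only that every upload is submitted to one `ExecutorService` created once and reused, so the number of native threads stays bounded no matter how many streams are closed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SharedUploadPool {
    // Hypothetical stand-in for a transfer manager that accepts an external pool.
    static final class Uploader {
        private final ExecutorService pool;

        Uploader(ExecutorService pool) {
            this.pool = pool;
        }

        // Submits the (simulated) upload to the shared pool instead of a private one.
        Future<String> upload(String key) {
            return pool.submit(() -> "uploaded " + key);
        }
    }

    // Uploads every key through the same pool and returns how many completed.
    static int uploadAll(List<String> keys, ExecutorService shared) throws Exception {
        Uploader uploader = new Uploader(shared);
        List<Future<String>> pending = new ArrayList<>();
        for (String key : keys) {
            pending.add(uploader.upload(key));
        }
        int done = 0;
        for (Future<String> f : pending) {
            f.get(); // wait for completion; rethrows any task failure
            done++;
        }
        return done;
    }

    public static void main(String[] args) throws Exception {
        // One pool for the whole process; the size of 4 is an arbitrary example value.
        ExecutorService shared = Executors.newFixedThreadPool(4);
        try {
            List<String> keys = new ArrayList<>();
            for (int i = 0; i < 1000; i++) {
                keys.add("key-" + i);
            }
            System.out.println(uploadAll(keys, shared) + " uploads completed");
        } finally {
            shared.shutdown();
        }
    }
}
```

In the real code the equivalent step would presumably be to hand the shared pool to TransferManager through a constructor that accepts an `ExecutorService`, rather than letting each instance build its own.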

Attachments

hadoop-11446-001.patch, 24/Dec/14 22:49, 7 kB, Ted Yu
hadoop-11446-002.patch, 28/Dec/14 22:58, 9 kB, Ted Yu
hadoop-11446-003.patch, 30/Dec/14 18:04, 10 kB, Ted Yu
hadoop-11446.addendum, 05/Jan/15 15:04, 2 kB, Ted Yu

Issue Links

is depended upon by: HADOOP-11571 Über-jira: S3a stabilisation phase I (Closed)
