Affects Version/s: 2.8.0
Fix Version/s: None
I have a production workload that is hitting its ulimit because it keeps creating s3a-transfer-unbounded threads.
Relevant background: https://issues.apache.org/jira/browse/HADOOP-13826.
Before that change, the thread pool used in the TransferManager had both a reasonably small maximum pool size and work queue capacity.
After that change, the thread pool has both a maximum pool size and work queue capacity of Integer.MAX_VALUE.
This seems like a pretty bad idea: practically speaking, there is now no bound on the number of threads that might get created. I understand the change was made in response to observed deadlocks and in line with a warning in the documentation, which I will repeat here:
It is not recommended to use a single threaded executor or a thread pool with a bounded work queue as control tasks may submit subtasks that can't complete until all sub tasks complete. Using an incorrectly configured thread pool may cause a deadlock (I.E. the work queue is filled with control tasks that can't finish until subtasks complete but subtasks can't execute because the queue is filled).
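To illustrate the deadlock that warning describes, here is a minimal stand-alone sketch (class and variable names are mine, not from the s3a code): a control task submits a subtask to the same single-threaded pool and then waits on it, so the subtask can never be scheduled. A timeout is used so the demo terminates instead of hanging forever.

```java
import java.util.concurrent.*;

public class SingleThreadDeadlockDemo {
    public static void main(String[] args) throws Exception {
        // An "incorrectly configured" pool per the TransferManager docs:
        // a single thread, so a control task and its subtask compete for it.
        ExecutorService single = Executors.newSingleThreadExecutor();

        Future<String> parent = single.submit(() -> {
            // The control task submits a subtask to the same pool...
            Future<String> child = single.submit(() -> "child finished");
            try {
                // ...and blocks waiting for it. The subtask can never run,
                // because the pool's only thread is occupied by this task.
                return child.get(2, TimeUnit.SECONDS);
            } catch (TimeoutException e) {
                return "deadlock: subtask never ran";
            }
        });

        System.out.println(parent.get());
        single.shutdownNow();
    }
}
```

The same wedge occurs with a bounded work queue once it fills with control tasks, which is exactly the scenario the documentation calls out.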
The documentation warns only against a bounded work queue, not against a bounded maximum pool size. That distinction seems right: an unbounded work queue sounds fine; an unbounded maximum pool size does not.
I will also note that this constructor is now deprecated and suggests using TransferManagerBuilder instead, which by default creates a fixed thread pool of size 10: https://github.com/aws/aws-sdk-java/blob/1.11.534/aws-java-sdk-s3/src/main/java/com/amazonaws/services/s3/transfer/internal/TransferManagerUtils.java#L59.
I suggest we make a small change here and keep the maximum pool size at maxThreads, which defaults to 10, while keeping the work queue as is (unbounded).
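As a sanity check of the suggested configuration, here is a minimal stand-alone sketch (the names and the choice of core size are my assumptions, not the actual s3a code): a ThreadPoolExecutor whose core and maximum pool sizes are both maxThreads, in front of an unbounded LinkedBlockingQueue. Submitting far more tasks than threads grows the queue, not the thread count.

```java
import java.util.concurrent.*;

public class BoundedPoolDemo {
    public static void main(String[] args) throws Exception {
        // maxThreads mirrors the default of 10 mentioned above; the rest of
        // this setup is an illustrative sketch, not the s3a pool itself.
        int maxThreads = 10;
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            maxThreads, maxThreads,
            60L, TimeUnit.SECONDS,
            new LinkedBlockingQueue<>()); // unbounded work queue

        CountDownLatch allStarted = new CountDownLatch(maxThreads);
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < 100; i++) {
            pool.execute(() -> {
                allStarted.countDown();
                try {
                    release.await(); // hold the thread until measured
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        allStarted.await();

        // 100 tasks were submitted, but the pool never grows past
        // maxThreads; the remaining tasks simply wait in the queue.
        System.out.println("threads=" + pool.getPoolSize());
        System.out.println("queued=" + pool.getQueue().size());

        release.countDown();
        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.SECONDS);
    }
}
```

Because control tasks queue instead of being rejected, this keeps the no-bounded-queue property that HADOOP-13826 needed while capping thread creation.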