  1. Hadoop Common
  2. HADOOP-13826

S3A Deadlock in multipart copy due to thread pool limits.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.7.3
    • Fix Version/s: 2.8.0, 3.0.0-alpha4
    • Component/s: fs/s3
    • Labels: None
      Description

      While testing HIVE-15093, we encountered deadlocks in the S3A connector. The TransferManager javadocs (http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/transfer/TransferManager.html) explain how this is possible:

      It is not recommended to use a single threaded executor or a thread pool with a bounded work queue as control tasks may submit subtasks that can't complete until all sub tasks complete. Using an incorrectly configured thread pool may cause a deadlock (I.E. the work queue is filled with control tasks that can't finish until subtasks complete but subtasks can't execute because the queue is filled).
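
      The single-threaded case described in the javadoc excerpt can be reproduced without the AWS SDK. The following is a minimal sketch using only java.util.concurrent; the class name PoolDeadlockSketch and the method controlTaskCompletes are hypothetical names chosen for illustration and are not part of S3A or TransferManager. A "control" task submits a "subtask" to the same pool and blocks on its result, and a timeout is used to detect the stall instead of hanging forever.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration only: not code from S3A or the AWS SDK.
class PoolDeadlockSketch {

    /**
     * Runs one control task on a fixed pool of poolSize threads. The control
     * task submits one subtask to the same pool and waits for it.
     * Returns true if the control task finishes within one second.
     */
    static boolean controlTaskCompletes(int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            Future<String> control = pool.submit(() -> {
                // The subtask needs a free worker thread in the same pool.
                Future<String> part = pool.submit(() -> "part copied");
                // The control task holds its own thread while it waits.
                return part.get();
            });
            control.get(1, TimeUnit.SECONDS);
            return true;
        } catch (TimeoutException e) {
            // The subtask never ran: every worker was held by a waiting control task.
            return false;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            // Interrupt any stuck workers so the JVM can exit.
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("1 thread:  completes = " + controlTaskCompletes(1));
        System.out.println("2 threads: completes = " + controlTaskCompletes(2));
    }
}
```

      With one thread, the control task occupies the only worker, so its subtask stays queued and the call times out. With a second thread the subtask can run and the control task completes. The sketch shows only the failure mode; it does not represent the fix applied in the attached patches.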

        Attachments

        1. HADOOP-13206-branch-2-005.patch (10 kB, Steve Loughran)
        2. HADOOP-13826.001.patch (5 kB, Sean Mackrory)
        3. HADOOP-13826.002.patch (16 kB, Sean Mackrory)
        4. HADOOP-13826.003.patch (8 kB, Sean Mackrory)
        5. HADOOP-13826.004.patch (9 kB, Sean Mackrory)
        6. HADOOP-13826-branch-2-006.patch (10 kB, Steve Loughran)
        7. HADOOP-13826-branch-2-007.patch (10 kB, Steve Loughran)

              People

              • Assignee: Sean Mackrory (mackrorysd)
              • Reporter: Sean Mackrory (mackrorysd)
              • Votes: 0
              • Watchers: 14
