  1. Hadoop Common
  2. HADOOP-14239

S3A Retry Multiple S3 Key Deletion


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 2.8.0
    • Fix Version/s: None
    • Component/s: fs/s3
    • Labels: None
    • Environment: EC2, AWS

    Description

      When fs.s3a.multiobjectdelete.enable is true, S3A tries to delete multiple S3 keys in a single request.

      This is a useful feature, but it becomes problematic when AWS fails to delete some of the keys in the deletion list. The aws-java-sdk retries internally, but this does not help because it resubmits the same list of keys, including those that were already deleted. As a result, every subsequent retry fails on the previously deleted keys, since they no longer exist. Eventually an exception is thrown and the whole job fails.

      Luckily, the AWS API reports which keys it failed to delete. S3A should retry only the keys that failed to be deleted.
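
      The proposed behavior can be sketched as follows. This is a minimal illustration, not the S3A implementation: PartialDeleteRetry, BulkDeleter, deleteWithRetry and FakeStore are hypothetical names, and the in-memory FakeStore stands in for S3. In the AWS SDK for Java 1.x, the failed keys would typically come from MultiObjectDeleteException.getErrors(); that wiring is assumed here rather than shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Sketch: retry a bulk delete with only the keys that failed. All names are hypothetical. */
class PartialDeleteRetry {

    /** Hypothetical stand-in for a bulk-delete call; returns the keys that could NOT be deleted. */
    interface BulkDeleter {
        List<String> deleteAll(List<String> keys);
    }

    /**
     * Deletes the keys, resubmitting only the failed ones, for at most maxAttempts rounds.
     * Returns the keys still undeleted after the last round (empty on full success).
     */
    static List<String> deleteWithRetry(BulkDeleter deleter, List<String> keys, int maxAttempts) {
        List<String> pending = new ArrayList<>(keys);
        for (int attempt = 1; attempt <= maxAttempts && !pending.isEmpty(); attempt++) {
            // Only the failures reported by the previous round are sent again.
            pending = new ArrayList<>(deleter.deleteAll(pending));
        }
        return pending;
    }

    /** In-memory fake store: each key in "flaky" fails once, then succeeds. */
    static class FakeStore implements BulkDeleter {
        final Set<String> objects = new HashSet<>();
        final Set<String> flaky = new HashSet<>();
        final List<List<String>> requests = new ArrayList<>();

        @Override
        public List<String> deleteAll(List<String> keys) {
            requests.add(new ArrayList<>(keys)); // record what each request contained
            List<String> failed = new ArrayList<>();
            for (String key : keys) {
                if (flaky.remove(key)) {
                    failed.add(key);     // simulated transient failure, reported back
                } else {
                    objects.remove(key); // deleted
                }
            }
            return failed;
        }
    }

    public static void main(String[] args) {
        FakeStore store = new FakeStore();
        store.objects.addAll(Arrays.asList("a", "b", "c"));
        store.flaky.add("b");
        List<String> left = deleteWithRetry(store, Arrays.asList("a", "b", "c"), 3);
        // prints: undeleted=[] requests=[[a, b, c], [b]]
        System.out.println("undeleted=" + left + " requests=" + store.requests);
    }
}
```

      The second request contains only the key that failed, so keys deleted in the first round are never resubmitted. A production version would presumably also bound the retries with a backoff and surface the remaining keys in the thrown exception.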


            People

              Assignee: Unassigned
              Reporter: Kazuyuki Tanimura (kazuyukitanimura)
              Votes: 0
              Watchers: 4
