  1. Hadoop Common
  2. HADOOP-15620 Über-jira: S3A phase VI: Hadoop 3.3 features
  3. HADOOP-13407

s3a directory housekeeping operations to be done in async thread


    Details

    • Type: Sub-task
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 2.8.0, 3.1.0, 2.9.1, 3.0.3
    • Fix Version/s: None
    • Component/s: fs/s3
    • Labels:
      None

      Description

      Some of the delay on s3a calls comes from cleaning up parent pseudo directories: repeated getParent/GET calls to look for the marker entries, followed by further calls to delete them.

      We could possibly make this asynchronous; the core semantics would be retained, just the cleanup delayed.
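
      A minimal sketch of the idea, assuming a toy in-memory key set in place of a real bucket. This is not the S3AFileSystem code: the class AsyncMarkerCleanup and its methods mkdir, createFile, parentMarkers and exists are hypothetical names used only for illustration. The write completes synchronously, while deletion of the parent directory markers is handed to a background thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Toy in-memory stand-in for a bucket; hypothetical, not the real s3a client. */
class AsyncMarkerCleanup {
  // Object keys. A key ending in "/" models an empty pseudo directory marker.
  private final Set<String> keys = ConcurrentHashMap.newKeySet();

  // Single daemon thread that performs the delayed housekeeping.
  private final ExecutorService cleaner = Executors.newSingleThreadExecutor(r -> {
    Thread t = new Thread(r, "marker-cleanup");
    t.setDaemon(true);
    return t;
  });

  /** Create an empty pseudo directory by writing its marker key. */
  void mkdir(String dir) {
    keys.add(dir.endsWith("/") ? dir : dir + "/");
  }

  /** Marker keys of every ancestor of the given key, nearest parent first. */
  static List<String> parentMarkers(String key) {
    List<String> markers = new ArrayList<>();
    int slash = key.lastIndexOf('/');
    while (slash > 0) {
      markers.add(key.substring(0, slash + 1));
      slash = key.lastIndexOf('/', slash - 1);
    }
    return markers;
  }

  /**
   * Store the file key immediately, then delete the parent markers
   * asynchronously. The returned future completes when cleanup is done,
   * or completes exceptionally if the cleanup fails.
   */
  CompletableFuture<Void> createFile(String key) {
    keys.add(key);
    return CompletableFuture.runAsync(
        () -> parentMarkers(key).forEach(keys::remove), cleaner);
  }

  boolean exists(String key) {
    return keys.contains(key);
  }

  void close() {
    cleaner.shutdown();
  }
}
```

      In this sketch a caller that needs the cleanup to have finished can wait on the returned future; everyone else proceeds as soon as the file key is stored. Until the future completes, the marker keys are still present, so a probe for a parent marker can still succeed.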

      Risks?

      1. While the cleanup is in progress, getFileStatus on a parent directory could report that the directory is still empty.
      2. Failure of the cleanup partway through.

      Of course, these risks exist today. We really need an s3a fsck.


              People

              • Assignee:
                Unassigned
                Reporter:
                stevel@apache.org Steve Loughran
              • Votes:
                0
                Watchers:
                3
