Description
If S3AFileSystem issues an S3 LIST limited to a single object to determine whether a directory is empty, and the single entry returned has a tombstone marker (either from an inconsistent DDB table or from an eventually consistent LIST), then it will consider the directory empty, even when one or more entries exist which are not deleted.
We need to make the calculation of whether a directory is empty resilient to this, and do so efficiently.
It surfaces as an issue in two places:
- delete(path) (where it may make things worse)
- rename(src, dest), where a check is made for dest != an empty directory.
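A minimal sketch of one resilient approach, assuming a hypothetical MetadataStore.isTombstoned() helper rather than the real S3A/S3Guard classes: instead of trusting a LIST capped at a single entry, keep consuming the listing and skip tombstoned keys until either a live entry is found or the listing is exhausted.

```java
import java.util.Iterator;

/**
 * Sketch only: MetadataStore and the string-keyed listing here are
 * hypothetical stand-ins, not the actual S3A/S3Guard types.
 */
public class EmptyDirCheckSketch {

  /** Hypothetical view of the S3Guard metadata store. */
  interface MetadataStore {
    /** @return true if the store holds a tombstone (delete marker) for this key. */
    boolean isTombstoned(String key);
  }

  /**
   * Decide whether a directory is empty while ignoring entries the
   * metadata store has marked as deleted. Rather than stopping after
   * the first listed entry, scan until a live (non-tombstoned) child
   * is found or the listing ends.
   */
  static boolean isEmptyDirectory(Iterator<String> listing, MetadataStore store) {
    while (listing.hasNext()) {
      String key = listing.next();
      if (!store.isTombstoned(key)) {
        return false;   // at least one live child: directory is not empty
      }
      // tombstoned entry: skip it and keep scanning
    }
    return true;        // no live children seen
  }
}
```

The extra listing cost is only paid when the leading entries are tombstoned, which should be the uncommon case, so the common path stays as cheap as the current single-object LIST.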
Issue Links
- blocks
  - HADOOP-14936 S3Guard: remove "experimental" from documentation (Resolved)
- is caused by
  - HADOOP-16279 S3Guard: Implement time-based (TTL) expiry for entries (and tombstones) (Resolved)
- is related to
  - HADOOP-16427 Downgrade INFO message on rm s3a root dir to DEBUG (Resolved)
- relates to
  - HADOOP-15183 S3Guard store becomes inconsistent after partial failure of rename (Resolved)
  - HADOOP-16384 S3A: Avoid inconsistencies between DDB and S3 (Resolved)
- links to