Details
- Type: Sub-task
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Affects Version/s: 2.9.0
Description
Users of s3a may not realize that, in some cases, it does not interoperate well with other s3 tools, such as the AWS CLI. (See HIVE-13778, IMPALA-3558).
Specifically, if a user:
- Creates an empty directory with hadoop fs -mkdir s3a://bucket/path
- Copies data into that directory via another tool, e.g. the AWS CLI.
- Tries to access the data in that directory with any Hadoop software.
Then the last step fails: the fake empty directory blob that s3a wrote in the first step causes s3a (listStatus() etc.) to continue to treat that directory as empty, even though the second step was supposed to populate the directory with data.
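A minimal command-line sketch of that sequence, assuming a hypothetical bucket, prefix, and local file (the exact listing behaviour depends on the releases involved):

  hadoop fs -mkdir s3a://example-bucket/data/t1           # s3a writes a zero-byte "fake directory" marker object for data/t1/
  aws s3 cp part-00000.csv s3://example-bucket/data/t1/   # another tool writes real objects under the same prefix, bypassing s3a
  hadoop fs -ls s3a://example-bucket/data/t1               # s3a's listStatus() can still report an empty directory because of the marker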
I wanted to document this fact for users. We may mark this as won't-fix, "by design". It may also be interesting to brainstorm solutions and/or a config option to change the behavior, if folks care.
Attachments
Issue Links
- breaks
  - HADOOP-17403 S3A ITestPartialRenamesDeletes.testRenameDirFailsInDelete failure: missing directory marker (Resolved)
- causes
  - HADOOP-17244 S3A directory delete tombstones dir markers prematurely (Resolved)
  - HADOOP-17261 s3a rename() now requires s3:deleteObjectVersion permission (Resolved)
  - HADOOP-17293 S3A to always probe S3 in S3A getFileStatus on non-auth paths (Resolved)
- contains
  - HADOOP-17200 Renaming a file under a sibling empty directory doesn't delete dest dir's marker (Resolved)
  - HADOOP-13430 Optimize getFileStatus in S3A (Resolved)
  - HADOOP-16493 S3AFilesystem.initiateRename() can skip check on dest.parent status if src has same parent (Resolved)
- Dependency
  - HADOOP-17200 Renaming a file under a sibling empty directory doesn't delete dest dir's marker (Resolved)
- fixes
  - HADOOP-17217 S3A FileSystem does not correctly delete directories with fake entries (Resolved)
- is depended upon by
  - HADOOP-17217 S3A FileSystem does not correctly delete directories with fake entries (Resolved)
  - SPARK-35299 Dataframe overwrite on S3A does not delete old files with S3 object-put to table path/ (Resolved)
- is duplicated by
  - HADOOP-16846 add experimental optimization of s3a directory marker handling (Resolved)
  - HADOOP-16942 S3A creating folder level delete markers (Resolved)
- is related to
  - HADOOP-14255 S3A to delete unnecessary fake directory objects in mkdirs() (Resolved)
  - HADOOP-16804 s3a mkdir path/ can add 404 to S3 load balancers (Resolved)
  - HADOOP-17199 Backport HADOOP-13230 list/getFileStatus changes for preserved directory markers (Resolved)
  - HADOOP-17227 improve s3guard markers command line tool (Resolved)
  - HADOOP-17228 Backport HADOOP-13230 listing changes for preserved directory markers to 3.1.x (Resolved)
  - HADOOP-18752 Change fs.s3a.directory.marker.retention to "keep" (Resolved)
  - HADOOP-14124 S3AFileSystem silently deletes "fake" directories when writing a file (Resolved)
  - HADOOP-17359 [Hadoop-Tools] S3A MultiObjectDeleteException after uploading a file (Resolved)
- relates to
  - IMPALA-3558 DROP TABLE PURGE on S3A table may not delete externally written files (Resolved)
  - HADOOP-13164 Optimize S3AFileSystem::deleteUnnecessaryFakeDirectories (Resolved)
- supercedes
  - HADOOP-16090 S3A Client to add explicit support for versioned stores (Resolved)
- links to