Description
As promised to StephanEwen: add an s3a-specific option to the builder API so that files can be created with all existence checks skipped.
This
- eliminates a few hundred milliseconds from each file creation
- avoids any caching of negative HEAD/GET responses in the S3 load balancers.
Callers will be expected to know what they are doing.
FWIW, we are doing some PUT calls in the committer which bypass this stuff, for the same reason. If you've just created a directory, you know there's nothing underneath, so no need to check.
Adding this inside HADOOP-17833, as we are effectively doing this already under the magic dir tree. Having it as an option and using it to save all manifests/success files also saves one LIST per manifest write (task commit) and the LIST when saving a _SUCCESS file.
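To make the saving concrete, here is a toy model of the idea, not the S3A code: an in-memory store that counts requests, and a create call with a flag that skips the probes. `CountingStore`, `create` and `skipChecks` are all made-up names for illustration; which probes the real client issues depends on things like the overwrite flag, so treat the HEAD + LIST pair here as an assumption.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only: counts the requests a file creation would issue. */
public class CreateProbeSketch {

    /** Hypothetical in-memory stand-in for an object store. */
    static class CountingStore {
        final Map<String, byte[]> objects = new HashMap<>();
        int heads;
        int lists;
        int puts;

        /** Models a HEAD: is there an object at exactly this key? */
        boolean head(String key) {
            heads++;
            return objects.containsKey(key);
        }

        /** Models a LIST: is there anything under key + "/"? */
        boolean listHasChildren(String key) {
            lists++;
            String dir = key.endsWith("/") ? key : key + "/";
            for (String k : objects.keySet()) {
                if (k.startsWith(dir)) {
                    return true;
                }
            }
            return false;
        }

        /** Models the PUT which actually writes the object. */
        void put(String key, byte[] data) {
            puts++;
            objects.put(key, data);
        }
    }

    /**
     * Hypothetical create call. With skipChecks set, the caller is trusted
     * to know that the path is neither an existing file nor a directory.
     */
    static void create(CountingStore store, String key, byte[] data, boolean skipChecks) {
        if (!skipChecks) {
            if (store.head(key)) {
                throw new IllegalStateException("file exists: " + key);
            }
            if (store.listHasChildren(key)) {
                throw new IllegalStateException("is a directory: " + key);
            }
        }
        store.put(key, data);
    }

    public static void main(String[] args) {
        CountingStore checked = new CountingStore();
        create(checked, "job/_SUCCESS", new byte[0], false);
        System.out.println("checked: HEAD=" + checked.heads
            + " LIST=" + checked.lists + " PUT=" + checked.puts);

        CountingStore fast = new CountingStore();
        create(fast, "job/_SUCCESS", new byte[0], true);
        System.out.println("skipped: HEAD=" + fast.heads
            + " LIST=" + fast.lists + " PUT=" + fast.puts);
    }
}
```

The flip side is visible in the same model: with the flag set, a create over an existing key silently overwrites it, which is why callers are expected to know what they are doing.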
Issue Links
- Is contained by: HADOOP-17833 Improve Magic Committer Performance (Resolved)
- is duplicated by: HADOOP-18278 Do not perform a LIST call when creating a file (Resolved)
- is related to: BEAM-5934 FileSink affected by S3 eventual consistency (Open)
- is superceded by: HADOOP-16490 Avoid/handle cached 404s during S3A file creation (Resolved)
- relates to: HADOOP-15525 s3a: clarify / improve support for mixed ACL buckets (Open)