Description
Various object stores provide etags in their FileStatus implementations.
Make these values accessible:
- new interface EtagFromFileStatus to be implemented when provided
- filesystem.md to declare requirements of etags (constant between LIST and HEAD)...
- path capabilities for (a) etag and (b) etags consistent across rename
Add an implementation for abfs first, with s3a (and Google GCS) later.
This is initially to handle recovery from certain failures in job commit against abfs, but it would also let a cloud-ready version of distcp track the etags of uploaded files and so diff properly.
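A minimal sketch of how the proposed interface could be consumed, assuming a single `getEtag()` accessor. Only the interface name `EtagFromFileStatus` comes from the description; the method name, `SketchFileStatus`, `EtagSketch`, and both helper methods are hypothetical, and the code deliberately avoids Hadoop's own classes so that it stands alone. The rename check is only meaningful on a store that declares etags consistent across rename.

```java
import java.util.Optional;

/** Hypothetical shape of the interface named in the description. */
interface EtagFromFileStatus {
    /** @return the etag of the object, or null if the store supplied none. */
    String getEtag();
}

/** Stand-in for a store-specific status class; NOT Hadoop's FileStatus. */
final class SketchFileStatus implements EtagFromFileStatus {
    private final String path;
    private final String etag;

    SketchFileStatus(String path, String etag) {
        this.path = path;
        this.etag = etag;
    }

    String getPath() {
        return path;
    }

    @Override
    public String getEtag() {
        return etag;
    }
}

final class EtagSketch {
    private EtagSketch() {
    }

    /** Return the etag if the status object exposes a non-empty one. */
    static Optional<String> etagOf(Object status) {
        if (status instanceof EtagFromFileStatus) {
            String etag = ((EtagFromFileStatus) status).getEtag();
            if (etag != null && !etag.isEmpty()) {
                return Optional.of(etag);
            }
        }
        return Optional.empty();
    }

    /**
     * Assumed recovery check after a rename whose outcome is unknown:
     * the rename is treated as complete if the destination carries the
     * etag recorded for the source before the rename was attempted.
     */
    static boolean renameCompleted(String sourceEtag, Object destStatus) {
        if (sourceEtag == null || sourceEtag.isEmpty()) {
            return false;
        }
        return etagOf(destStatus).map(sourceEtag::equals).orElse(false);
    }
}
```

A caller would record the source etag before the rename and, on an ambiguous failure, compare it against the destination status. A status object that does not implement the interface simply yields an empty result, so callers can fall back to their existing behaviour.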
Issue Links
- blocks
  - MAPREDUCE-7341 Add a task-manifest output committer for Azure and GCS (Resolved)
- causes
  - HADOOP-18075 ABFS: Fix failure caused by listFiles() in ITestAbfsRestOperationException (Resolved)
- incorporates
  - HADOOP-17492 abfs listLocatedStatus to support incremental/async page fetching (Resolved)
- is related to
  - HADOOP-18012 ABFS: Enable config controlled ETag check for Rename idempotency (Resolved)
  - HADOOP-18425 [ABFS]: RenameFilePath Source File Not Found (404) error in retry loop (Resolved)
  - HADOOP-17981 Support etag-assisted renames in FileOutputCommitter (Resolved)
  - HADOOP-18672 ask: abfs connector to support checksum (Resolved)
- relates to
  - HADOOP-18002 abfs rename idempotency broken -remove recovery (Resolved)
- links to