DistCp currently uploads a file using one of two strategies:
- append parts
- copy to temp then rename
Option 2 (copy to temp, then rename) executes the following sequence in promoteTmpToTarget
For any object store, that's a lot of HTTP requests; for S3A you are looking at 12+ requests and an O(data) copy call.
This is not a good upload strategy for any store which manifests its output atomically at the end of the write(): the temp file and rename add cost without adding any atomicity the store does not already provide.
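To make the cost difference concrete, here is a toy model of the two upload paths. It is not DistCp or S3A code. CountingStore, the ".distcp.tmp" suffix and the exact call sequence are assumptions made for illustration. Rename is modelled as copy plus delete, and the counts it produces are those of the model only, not the S3A figures quoted above.

```python
class CountingStore:
    """Toy in-memory object store that records every call made against it."""

    def __init__(self):
        self.objects = {}      # path -> bytes
        self.calls = []        # names of the operations issued, in order
        self.bytes_copied = 0  # bytes moved by store-side copies

    def exists(self, path):
        self.calls.append("exists")
        return path in self.objects

    def delete(self, path):
        self.calls.append("delete")
        return self.objects.pop(path, None) is not None

    def put(self, path, data):
        # The object only becomes visible once the whole write completes.
        self.calls.append("put")
        self.objects[path] = data

    def rename(self, src, dst):
        # Assumption: no native rename, so it is a copy followed by a delete.
        self.calls.append("copy")
        self.objects[dst] = self.objects[src]
        self.bytes_copied += len(self.objects[src])
        self.calls.append("delete")
        del self.objects[src]


def upload_via_tmp(store, target, data):
    """Write to a temporary path, then promote it to the target."""
    tmp = target + ".distcp.tmp"  # hypothetical temp-name scheme
    store.put(tmp, data)
    if store.exists(target):
        store.delete(target)
    store.rename(tmp, target)


def upload_direct(store, target, data):
    """Write straight to the target path."""
    store.put(target, data)


if __name__ == "__main__":
    data = b"x" * 1024
    via_tmp, direct = CountingStore(), CountingStore()
    upload_via_tmp(via_tmp, "bucket/file", data)
    upload_direct(direct, "bucket/file", data)
    print(via_tmp.calls, via_tmp.bytes_copied)  # ['put', 'exists', 'copy', 'delete'] 1024
    print(direct.calls, direct.bytes_copied)    # ['put'] 0
```

Even in this simplified model the rename path issues extra calls and copies every byte a second time, while the direct path is a single write.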
Proposed: add a switch to write directly to the destination path. The switch can be supplied as either a conf option (distcp.direct.write = true) or a CLI option (-direct).
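A minimal sketch of how such a switch might be resolved, written in Python for brevity rather than as DistCp code. The key distcp.direct.write and the -direct flag come from the proposal. The function names, the ".distcp.tmp" suffix and the rule that either source enables the feature are assumptions.

```python
import argparse

DIRECT_WRITE_KEY = "distcp.direct.write"  # conf key named in the proposal


def parse_bool(value):
    """Interpret a conf value as a boolean; only 'true' (any case) enables."""
    return str(value).strip().lower() == "true"


def direct_write_enabled(conf, argv):
    """Return True if the CLI flag or the conf option asks for direct writes."""
    parser = argparse.ArgumentParser(
        prog="distcp", add_help=False, allow_abbrev=False)
    parser.add_argument("-direct", action="store_true")
    # Ignore every other argument; only the -direct flag matters here.
    args, _unknown = parser.parse_known_args(argv)
    return args.direct or parse_bool(conf.get(DIRECT_WRITE_KEY, "false"))


def write_path(target, direct):
    """Path the copy writes to: the target itself, or a temp file to rename."""
    return target if direct else target + ".distcp.tmp"  # hypothetical suffix


if __name__ == "__main__":
    conf = {DIRECT_WRITE_KEY: "true"}
    direct = direct_write_enabled(conf, ["s3a://src/f", "s3a://dst/f"])
    print(write_path("s3a://dst/f", direct))  # s3a://dst/f
```

With the switch off, the existing temp-then-rename behaviour is unchanged, so the option stays opt-in for stores where rename is cheap and atomic.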