Details
Type: Sub-task
Status: Patch Available
Priority: Major
Resolution: Unresolved
Description
Hive uses 'distcp' to copy files in parallel between HDFS encryption zones when the file to copy is larger than the hive.exec.copyfile.maxsize threshold. The same 'distcp' path is also taken when copying to S3, where it makes the copy slower.
We should not invoke distcp when copying to blobstore systems such as S3.
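The proposed behaviour can be sketched as a small decision function. This is only an illustration, not the attached patch: the class name `CopyStrategySketch`, the methods `isBlobStore` and `shouldUseDistCp`, and the list of URI schemes are all hypothetical, and the threshold is passed in as a plain number rather than read from hive.exec.copyfile.maxsize.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class CopyStrategySketch {
    // Hypothetical set of blobstore URI schemes, chosen for illustration only.
    private static final Set<String> BLOB_STORE_SCHEMES =
            new HashSet<>(Arrays.asList("s3", "s3a", "s3n"));

    // Returns true when the path's scheme is one of the assumed blobstore schemes.
    static boolean isBlobStore(URI path) {
        String scheme = path.getScheme();
        return scheme != null
                && BLOB_STORE_SCHEMES.contains(scheme.toLowerCase(Locale.ROOT));
    }

    // Decides whether a copy should go through distcp.
    // Blobstore destinations always use a regular copy; otherwise distcp is
    // used only when the file is larger than the configured threshold.
    static boolean shouldUseDistCp(URI destination, long fileSizeBytes, long maxCopyFileSizeBytes) {
        if (isBlobStore(destination)) {
            return false;
        }
        return fileSizeBytes > maxCopyFileSizeBytes;
    }

    public static void main(String[] args) {
        long threshold = 32L * 1024 * 1024;   // assumed threshold for the demo
        long fileSize = 64L * 1024 * 1024;    // a file larger than the threshold

        URI s3Target = URI.create("s3a://example-bucket/warehouse/t1");
        URI hdfsTarget = URI.create("hdfs://namenode:8020/warehouse/t1");

        System.out.println("S3 target uses distcp: "
                + shouldUseDistCp(s3Target, fileSize, threshold));
        System.out.println("HDFS target uses distcp: "
                + shouldUseDistCp(hdfsTarget, fileSize, threshold));
    }
}
```

Checking the destination before comparing sizes keeps the existing threshold logic unchanged for HDFS-to-HDFS copies, so only blobstore targets would see different behaviour.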