Details
Improvement
Status: Patch Available
Major
Resolution: Unresolved
3.2.0, 3.1.1, 3.3.0
None
None
Description
CopyMapper#setup
...
try {
  overWrite = overWrite || targetFS.getFileStatus(targetFinalPath).isFile();
} catch (FileNotFoundException ignored) {
}
...
The code above forces the "overWrite" setting to "true" whenever the target path is an existing file, regardless of what the user configured. As a result, the file is transferred again even when the source and target files have the same checksum, which is unnecessary.
My suggestion is to remove the code above. A user who really wants to overwrite the target can pass -overwrite explicitly:
DistCp command with -overwrite option
hadoop distcp -overwrite hdfs://localhost:64464/source/5/6.txt hdfs://localhost:64464/target/5/6.txt
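The difference between the current and the proposed behavior can be illustrated with a small standalone sketch. It is not the CopyMapper code: the class OverwriteSketch, its method names, and the simplified skip rule (skip the copy only when overwrite is off and the checksums match) are assumptions made for illustration.

```java
public class OverwriteSketch {

    // Current behavior described above: the flag is forced to true
    // whenever the target path is an existing file.
    static boolean currentOverwrite(boolean configured, boolean targetIsFile) {
        return configured || targetIsFile;
    }

    // Proposed behavior: only the user-supplied setting counts.
    // targetIsFile is kept in the signature for comparison but is ignored.
    static boolean proposedOverwrite(boolean configured, boolean targetIsFile) {
        return configured;
    }

    // Simplified assumption, not the real DistCp logic: a copy is skipped
    // only when overwrite is off and the checksums match.
    static boolean shouldCopy(boolean overwrite, boolean checksumsMatch) {
        return overwrite || !checksumsMatch;
    }

    public static void main(String[] args) {
        boolean configured = false;     // user did not pass -overwrite
        boolean targetIsFile = true;    // target already exists as a file
        boolean checksumsMatch = true;  // source and target are identical

        System.out.println("current copies: "
            + shouldCopy(currentOverwrite(configured, targetIsFile), checksumsMatch));
        System.out.println("proposed copies: "
            + shouldCopy(proposedOverwrite(configured, targetIsFile), checksumsMatch));
    }
}
```

Under these assumptions, the current logic copies the identical file again, while the proposed logic skips it unless the user asks for an overwrite.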