Details
Description
Use distcp with -prbugpcaxt and -delete to copy data between clusters:
hadoop distcp -Dmapreduce.job.queuename="QueueA" -prbugpcaxt -update -delete hdfs://sourcecluster/user/hive/warehouse/sum.db hdfs://destcluster/user/hive/warehouse/sum.db
After distcp finished, we found that the timestamps on the destination differed from the source: some destination directories had a modification time equal to the time distcp ran.
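Looking at the distcp code, CopyCommitter preserves the timestamps first and only then processes the -delete option. Deleting entries changes the modification time of the destination directories that contained them, which overwrites the preserved values. So the -delete option should be processed first.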
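
The ordering problem can be reproduced outside Hadoop. The following is a minimal sketch on a local file system, not the CopyCommitter code: the names commit, run, SOURCE_MTIME and stale_file are hypothetical, and it assumes a file system where removing an entry updates the parent directory's modification time (as HDFS and typical POSIX file systems do).

```python
import os
import tempfile

# Hypothetical modification time of the source directory (seconds since epoch).
SOURCE_MTIME = 1_000_000_000


def commit(dest_dir, stale_name, delete_first):
    """Run the two commit steps in the chosen order; return the final directory mtime."""
    stale_path = os.path.join(dest_dir, stale_name)

    def preserve_times():
        # Stand-in for the "preserve timestamps" step (-pt).
        os.utime(dest_dir, (SOURCE_MTIME, SOURCE_MTIME))

    def delete_missing():
        # Stand-in for the -delete step: remove an entry absent from the source.
        os.remove(stale_path)

    steps = [delete_missing, preserve_times] if delete_first else [preserve_times, delete_missing]
    for step in steps:
        step()
    return int(os.stat(dest_dir).st_mtime)


def run(delete_first):
    """Build a throwaway destination directory with one stale file and commit it."""
    with tempfile.TemporaryDirectory() as root:
        dest_dir = os.path.join(root, "sum.db")
        os.mkdir(dest_dir)
        open(os.path.join(dest_dir, "stale_file"), "w").close()
        return commit(dest_dir, "stale_file", delete_first)


if __name__ == "__main__":
    print("preserve, then delete keeps source mtime:", run(False) == SOURCE_MTIME)
    print("delete, then preserve keeps source mtime:", run(True) == SOURCE_MTIME)
```

Under the stated assumption, only the delete-first order leaves the directory with the source modification time, which matches the proposed fix.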