Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 0.23.3
Fix Version/s: None
Component/s: None
Description
DistcpV2 (hadoop-tools/hadoop-distcp/..) can fail with a checksum mismatch, sometimes when copying a 0-byte file. The root cause may be an inconsistency in how HDFS creates 0-byte files; either way, distcp can avoid the issue by skipping the CRC check when the file size is zero.
Further, distcp fails the checksum comparison when copying between two clusters that use different block sizes. In this case it does not make sense to check the CRC, as the comparison is guaranteed to fail.
We need to turn CRC checking off for the above two cases.
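The two cases above could be captured in a single guard that runs before any checksum comparison. The sketch below is only an illustration of that rule, not the actual DistCp code: the class name `ChecksumPolicy`, the method `shouldCompareChecksums`, and its parameters are hypothetical, and it deliberately avoids Hadoop APIs so it stands alone.

```java
/**
 * Hypothetical helper (not part of DistCp) that decides whether a
 * source/target CRC comparison is meaningful for a copied file.
 */
final class ChecksumPolicy {
    private ChecksumPolicy() {
        // Utility class, no instances.
    }

    /**
     * @param sourceLength    length of the source file in bytes
     * @param sourceBlockSize block size of the source file
     * @param targetBlockSize block size of the target file
     * @return true only when comparing checksums can be expected to succeed
     */
    static boolean shouldCompareChecksums(long sourceLength,
                                          long sourceBlockSize,
                                          long targetBlockSize) {
        // Case 1: 0-byte files, where the check is reported to fail spuriously.
        if (sourceLength == 0) {
            return false;
        }
        // Case 2: differing block sizes, where the comparison is a guaranteed failure.
        if (sourceBlockSize != targetBlockSize) {
            return false;
        }
        return true;
    }
}
```

A caller would consult this guard first and skip the CRC comparison whenever it returns false, falling back to whatever other validation (for example, a length check) it already performs.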
Attachments
Issue Links
- is depended upon by: HADOOP-15788 Improve Distcp for long-haul/cloud deployments (Open)