Details
- Bug
- Status: Resolved
- Minor
- Resolution: Fixed
- 2.5.0
- None
Description
DistCp currently uses positioned reads (in RetriableFileCopyCommand#copyBytes) when the source offset is > 0. This adds unnecessary overhead: a new BlockReader is created on the client side, and multiple readBlock() calls are made to the DataNodes, each of which requires creating a BlockSender and an input stream to the ReplicaInfo.
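The difference between the two read styles can be sketched with plain java.nio on a local file. This is only an analogy, not the DistCp code: the class and method names below (OffsetCopySketch, readPositioned, readSeekThenSequential) are hypothetical, and a local FileChannel does not have the BlockReader/BlockSender costs described above. In Hadoop the corresponding calls would be the positioned `read(long position, byte[] buffer, int offset, int length)` on FSDataInputStream versus a single `seek(long)` followed by ordinary sequential reads. Whether the fix takes exactly that shape is not stated in this description.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

// Hypothetical illustration class; not part of Hadoop or DistCp.
class OffsetCopySketch {

    // Positioned reads: every call passes an explicit offset and the
    // channel's own position is never used or advanced.
    static byte[] readPositioned(Path src, long offset, int bufSize) throws IOException {
        try (FileChannel ch = FileChannel.open(src, StandardOpenOption.READ)) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            ByteBuffer buf = ByteBuffer.allocate(bufSize);
            long pos = offset;
            int n;
            while ((n = ch.read(buf, pos)) > 0) {
                out.write(buf.array(), 0, n);
                pos += n;      // caller tracks the offset itself
                buf.clear();
            }
            return out.toByteArray();
        }
    }

    // Seek once, then read sequentially: the channel keeps its own position,
    // so later reads continue where the previous one stopped.
    static byte[] readSeekThenSequential(Path src, long offset, int bufSize) throws IOException {
        try (FileChannel ch = FileChannel.open(src, StandardOpenOption.READ)) {
            ch.position(offset);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            ByteBuffer buf = ByteBuffer.allocate(bufSize);
            int n;
            while ((n = ch.read(buf)) > 0) {
                out.write(buf.array(), 0, n);
                buf.clear();
            }
            return out.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("offset-copy", ".bin");
        try {
            byte[] data = new byte[10_000];
            for (int i = 0; i < data.length; i++) {
                data[i] = (byte) i;
            }
            Files.write(tmp, data);

            byte[] a = readPositioned(tmp, 4096, 1024);
            byte[] b = readSeekThenSequential(tmp, 4096, 1024);
            System.out.println("same bytes: " + Arrays.equals(a, b));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Both methods return the same bytes; the point of the comparison is that the second style sets the offset once, which on HDFS would let a single block reader serve the rest of the copy.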
Attachments
Issue Links
- is related to
  - HDFS-9146 HDFS forward seek() within a block shouldn't spawn new TCP Peer/RemoteBlockReader (Open)
- is required by
  - HADOOP-16049 DistCp result has data and checksum mismatch when blocks per chunk > 0 (Resolved)
- relates to
  - HADOOP-15209 DistCp to eliminate needless deletion of files under already-deleted directories (Resolved)
  - MAPREDUCE-5899 Support incremental data copy in DistCp (Closed)