Details
- Type: Sub-task
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Hadoop Flags: Reviewed
Description
As discussed in HDFS-8430 and HDFS-9694, to compute a striped file's checksum even when some of its striped blocks are missing, we need to recompute the block checksum on the fly for the missing or corrupt blocks. Recomputing a block checksum requires reconstructing the block data by erasure decoding. Most of the code needed for block reconstruction can be borrowed from HDFS-9719, which refactors the existing ErasureCodingWorker. In the EC worker, reconstructed blocks are written out to target datanodes; here the remote write is unnecessary, because the reconstructed block data is used only to recompute the checksum.
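The idea can be illustrated with a minimal, self-contained sketch. It is not the HDFS implementation: a single XOR parity block stands in for the real erasure codec, a plain CRC32 stands in for the block checksum, and the class and method names (`BlockChecksumRecomputeSketch`, `xorParity`, `reconstruct`, `recomputeChecksum`) are hypothetical. What it shows is the flow described above: rebuild the missing block in memory from the surviving blocks, compute its checksum, and never write the rebuilt block anywhere.

```java
import java.util.zip.CRC32;

// Illustrative only: XOR parity is a stand-in for the real erasure codec,
// and CRC32 is a stand-in for the real block checksum.
final class BlockChecksumRecomputeSketch {

    // Hypothetical helper: build one XOR parity block over equally sized data blocks.
    static byte[] xorParity(byte[][] dataBlocks) {
        byte[] parity = new byte[dataBlocks[0].length];
        for (byte[] block : dataBlocks) {
            for (int i = 0; i < parity.length; i++) {
                parity[i] ^= block[i];
            }
        }
        return parity;
    }

    // Rebuild the block at missingIndex from the parity and the surviving blocks.
    // The entry at missingIndex is never read, so it may be null.
    static byte[] reconstruct(byte[][] blocks, byte[] parity, int missingIndex) {
        byte[] rebuilt = parity.clone();
        for (int b = 0; b < blocks.length; b++) {
            if (b == missingIndex) {
                continue;
            }
            for (int i = 0; i < rebuilt.length; i++) {
                rebuilt[i] ^= blocks[b][i];
            }
        }
        return rebuilt;
    }

    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    // Reconstruct in memory, checksum, and discard: no write to a target node.
    static long recomputeChecksum(byte[][] blocks, byte[] parity, int missingIndex) {
        byte[] rebuilt = reconstruct(blocks, parity, missingIndex);
        return crc32(rebuilt);
    }

    public static void main(String[] args) {
        byte[][] data = {
            {1, 2, 3, 4},
            {5, 6, 7, 8},
            {9, 10, 11, 12}
        };
        byte[] parity = xorParity(data);
        long original = crc32(data[1]);

        // Simulate losing block 1.
        byte[][] damaged = {data[0], null, data[2]};
        long recomputed = recomputeChecksum(damaged, parity, 1);

        System.out.println("checksums match: " + (original == recomputed));
    }
}
```

A single XOR parity block tolerates only one missing block per stripe; it is used here purely to keep the sketch short.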
Attachments
Issue Links
- depends upon
  - HDFS-9694 Make existing DFSClient#getFileChecksum() work for striped blocks (Resolved)
- is contained by
  - HDFS-8430 Erasure coding: compute file checksum for striped files (stripe by stripe) (In Progress)
- relates to
  - HDFS-10461 Erasure Coding: Optimize block checksum recalculation logic on the fly by reconstructing multiple missed blocks at a time (Open)
  - HDFS-10460 Erasure Coding: Recompute block checksum for a particular range less than file size on the fly by reconstructing missed block (Resolved)