Details
- Type: Improvement
- Status: Closed
- Priority: Major
- Resolution: Fixed
- Reviewed
Description
Currently, the various checksum types can only calculate the checksum of a single range of a byte array. For HDFS-2080, it would be advantageous to provide an API that, given a buffer containing some number of "checksum chunks", can either calculate or verify the checksums of all of the chunks. For example, given a 4KB buffer and a 512-byte chunk size, it would calculate or verify 8 CRC32s in one call.
This allows efficient JNI-based checksum implementations, since the cost of crossing the JNI boundary is amortized across many checksum computations instead of being paid once per chunk.
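To make the proposed shape concrete, here is a minimal pure-Java sketch of a bulk calculate/verify pair. It is an illustration only, not the patch attached to this issue: the class name `BulkChecksum`, the method names, and the choice of returning the index of the first bad chunk are all assumptions, and it uses `java.util.zip.CRC32` rather than any native code. Checksums are stored as consecutive 4-byte big-endian ints, which is also an assumption.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Hypothetical helper; names and layout are illustrative, not Hadoop's API.
class BulkChecksum {

    /** Writes one 4-byte CRC32 per chunk of `data` into `sums`. */
    static void calculateChunkedSums(int bytesPerChecksum, ByteBuffer data, ByteBuffer sums) {
        // Work on duplicates so the callers' positions are left untouched.
        ByteBuffer d = data.duplicate();
        ByteBuffer s = sums.duplicate();
        CRC32 crc = new CRC32();
        while (d.hasRemaining()) {
            s.putInt(nextChunkCrc(crc, d, bytesPerChecksum));
        }
    }

    /** Returns the index of the first chunk whose CRC32 mismatches, or -1 if all match. */
    static int verifyChunkedSums(int bytesPerChecksum, ByteBuffer data, ByteBuffer sums) {
        ByteBuffer d = data.duplicate();
        ByteBuffer s = sums.duplicate();
        CRC32 crc = new CRC32();
        for (int chunk = 0; d.hasRemaining(); chunk++) {
            if (nextChunkCrc(crc, d, bytesPerChecksum) != s.getInt()) {
                return chunk;
            }
        }
        return -1;
    }

    /** Checksums the next chunk (the last one may be short) and advances `d` past it. */
    private static int nextChunkCrc(CRC32 crc, ByteBuffer d, int bytesPerChecksum) {
        int n = Math.min(bytesPerChecksum, d.remaining());
        ByteBuffer chunk = d.slice();
        chunk.limit(n);
        crc.reset();
        crc.update(chunk);
        d.position(d.position() + n);
        return (int) crc.getValue();
    }

    public static void main(String[] args) {
        int bytesPerChecksum = 512;
        ByteBuffer data = ByteBuffer.allocate(4096);      // 4KB buffer -> 8 chunks
        for (int i = 0; i < data.capacity(); i++) {
            data.put(i, (byte) (i * 31));                 // absolute put, position stays 0
        }
        ByteBuffer sums = ByteBuffer.allocate(8 * 4);     // 8 CRC32s, 4 bytes each

        calculateChunkedSums(bytesPerChecksum, data, sums);
        System.out.println(verifyChunkedSums(bytesPerChecksum, data, sums)); // prints -1

        data.put(1500, (byte) (data.get(1500) ^ 0x01));   // corrupt one bit in chunk 2
        System.out.println(verifyChunkedSums(bytesPerChecksum, data, sums)); // prints 2
    }
}
```

With this shape, a JNI-backed implementation could replace the per-chunk loop with a single native call over the whole buffer, so the boundary is crossed once for all 8 chunks rather than 8 times.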
Attachments
Issue Links
- blocks
  - HDFS-2129 Simplify BlockReader to not inherit from FSInputChecker (Closed)
  - HADOOP-7443 Add CRC32C as another DataChecksum implementation (Closed)
  - HADOOP-7445 Implement bulk checksum verification using efficient native code (Closed)
- is part of
  - HDFS-2080 Speed up DFS read path by lessening checksum overhead (Closed)