Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- 2.5.2
- None
Description
While riding out a spate of HDFS data corruption issues, we've observed several places in the read path that do not fall back to HDFS checksums appropriately. These failures manifest during client reads and during compactions. Sometimes the failure is detected by the fallback verifyOnDiskSizeMatchesHeader, sometimes we attempt to allocate a buffer with a negative size, and sometimes we read through to a failure in block decompression.
After studying the code, I think that all three cases arise from using a block header that was read without checksum validation.
I will post the stack traces in the comments. I'm not sure whether we'll want a single patch or multiple.
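To illustrate the suspected failure mode, here is a minimal sketch of treating a size field as untrusted until the checksum covering it has been verified. This is not the HBase read path: the block layout (a 4-byte payload size, the payload, then an 8-byte CRC32 trailer), the class `HeaderSanitySketch`, and the `RETRY_WITH_FS_CHECKSUM` result are all hypothetical names chosen for the example. The idea is only that a size taken from an unverified header is bounds-checked before it is used for allocation or offsets, and any inconsistency is reported as "retry with the filesystem checksum" instead of being read through.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Hypothetical layout, NOT the HFile format:
// [int payloadSize][payload bytes][long crc32 over sizeField + payload]
class HeaderSanitySketch {
  static final int SIZE_FIELD = Integer.BYTES;
  static final int CRC_FIELD = Long.BYTES;

  enum Result { OK, RETRY_WITH_FS_CHECKSUM }

  /** Builds a block in the hypothetical layout above. */
  static byte[] encode(byte[] payload) {
    ByteBuffer buf = ByteBuffer.allocate(SIZE_FIELD + payload.length + CRC_FIELD);
    buf.putInt(payload.length).put(payload);
    CRC32 crc = new CRC32();
    crc.update(buf.array(), 0, SIZE_FIELD + payload.length);
    buf.putLong(crc.getValue());
    return buf.array();
  }

  /** Reads a block, never trusting the size field before it is validated. */
  static Result read(byte[] block) {
    if (block.length < SIZE_FIELD + CRC_FIELD) {
      return Result.RETRY_WITH_FS_CHECKSUM;
    }
    ByteBuffer buf = ByteBuffer.wrap(block);
    int payloadSize = buf.getInt();
    // The size has not been checksummed yet: a corrupt value may be negative
    // or point past the end of the data, so bounds-check it before use.
    int maxPayload = block.length - SIZE_FIELD - CRC_FIELD;
    if (payloadSize < 0 || payloadSize > maxPayload) {
      return Result.RETRY_WITH_FS_CHECKSUM;
    }
    CRC32 crc = new CRC32();
    crc.update(block, 0, SIZE_FIELD + payloadSize);
    long stored = buf.getLong(SIZE_FIELD + payloadSize);
    return crc.getValue() == stored ? Result.OK : Result.RETRY_WITH_FS_CHECKSUM;
  }
}
```

In this sketch, a flipped high bit in the size field is caught by the bounds check before any allocation, and a flipped payload byte is caught by the CRC comparison; both take the same fallback path.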