Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Description
It's somewhat troublesome to examine the contents of a log block container today. Tooling exists in the form of kudu pbc dump for metadata and hexdump for data, but it'd be nice to have more specialized tooling for examining containers to understand things like:
- What blocks are in this container? When was each block last updated? You can piece this together from the kudu pbc dump on the metadata, but having something more tabular might be nice.
- Does each block actually contain any data? If not, which blocks are empty?
- If a block is supposed to be a CFile block, does it have a valid header?
Some of the information I'd like to get at falls outside the purview of the log block manager itself, and requires knowing things like what kind of blocks we're dealing with. But the underlying struggle I'd like to address is: given a container, can we be more rigorous about our checks that the data is OK, and flag blocks that appear broken?
The context for this was a case (Kudu version 1.5.x) in which some form of corruption occurred, and we were left with containers that appeared to have holes punched out of them, resulting in messages complaining about bad CFile header magic values of "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00" (vs the expected "kuducfl2"). The log block metadata and tablet metadata both had records of many blocks, but the corresponding locations in the data files were all zeroes. It's unclear how this happened, and even the process of examining the containers and the blocks therein was not well documented.
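To make the kind of check I have in mind concrete, here is a minimal sketch in Python. It is not existing Kudu tooling. It assumes the block offsets and lengths have already been pulled out of the container's metadata by some other means (for example, by hand from the kudu pbc dump output), and that a CFile block starts with the "kuducfl2" magic at its first byte, as the error message above suggests. The names BlockRecord, BlockReport, check_block, scan_container, and format_table are all hypothetical.

```python
from typing import BinaryIO, Iterable, List, NamedTuple

# Expected CFile magic, taken from the error message quoted above.
CFILE_MAGIC = b"kuducfl2"


class BlockRecord(NamedTuple):
    """Hypothetical record: one block's location inside a container data file."""
    block_id: str
    offset: int
    length: int


class BlockReport(NamedTuple):
    block_id: str
    offset: int
    length: int
    # One of: "empty", "truncated", "all_zero", "no_cfile_magic", "ok".
    status: str


def check_block(data_file: BinaryIO, rec: BlockRecord) -> str:
    """Classify a single block by reading its bytes from the data file."""
    if rec.length == 0:
        return "empty"
    data_file.seek(rec.offset)
    # Reads the whole block into memory; fine for a sketch, but a real
    # tool would probably want to read large blocks in chunks.
    data = data_file.read(rec.length)
    if len(data) < rec.length:
        return "truncated"       # metadata points past the end of the file
    if not any(data):
        return "all_zero"        # looks like a punched hole
    if not data.startswith(CFILE_MAGIC):
        # Only meaningful if the block is expected to be a CFile block.
        return "no_cfile_magic"
    return "ok"


def scan_container(data_path: str, records: Iterable[BlockRecord]) -> List[BlockReport]:
    """Check every recorded block against the container's data file."""
    with open(data_path, "rb") as f:
        return [
            BlockReport(r.block_id, r.offset, r.length, check_block(f, r))
            for r in records
        ]


def format_table(reports: Iterable[BlockReport]) -> str:
    """Render the reports as a simple fixed-width table."""
    lines = [f"{'block_id':<20} {'offset':>12} {'length':>12}  status"]
    for r in reports:
        lines.append(f"{r.block_id:<20} {r.offset:>12} {r.length:>12}  {r.status}")
    return "\n".join(lines)
```

Calling format_table(scan_container(path, records)) gives one row per block, so the blocks that are all zeroes stand out immediately. A real tool would also need to read the container metadata itself and know which blocks are supposed to be CFiles; both are deliberately left out here.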