Kudu / KUDU-423 Implement scalable and performant on-disk storage / KUDU-668

Log block container metadata files should be more forgiving to truncation

Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: M5
    • Fix Version/s: None
    • Component/s: fs
    • Labels: None

    Description

      Log block container metadata files are resilient to many different kinds of failures (see pb_util.h for details). However, they are also overly strict with respect to truncation. Ideally, a truncation in the middle of a log block record should result in the record being discarded and the container reused for additional writes. The only way to do this safely is to prove that, between the truncation and the end of the file, there do not exist any other valid log block records. The WAL segment reader code has the same problem, and it handles this by trying to decode a segment header at every byte position between the point of truncation and the end of the file. Log block container metadata files should do the same thing.

      Here's what needs to happen:

      1. Containerized PB files should add a CRC32 checksum to the message header structure. Otherwise we can't tell if a particular read in the file comprises a "valid" message header.
      2. In the event of truncation, they should do what the WAL segment reader does and scan ahead in the file looking for valid message headers. If one is found, this is not truncation but corruption, and is unrecoverable. If none are found (or if the remainder of the file is all zeroes), it's recoverable truncation.
      3. If the truncation is recoverable, we should make sure to start writing new metadata at the point of truncation, not at the end of the file.
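
      The three steps above can be sketched as follows. This is a minimal illustration, not Kudu's actual on-disk format or API: the record layout (length, CRC32 of the length, payload, CRC32 of the payload), the `MAX_PAYLOAD` bound, and the names `encode_record`, `header_at`, `read_record`, and `recover` are all assumptions made for the example.

```python
import struct
import zlib

# Hypothetical record layout (NOT Kudu's real format):
#   [u32 payload length][u32 CRC32 of the length bytes][payload][u32 CRC32 of payload]
HEADER = struct.Struct("<II")
TRAILER = struct.Struct("<I")
MAX_PAYLOAD = 1 << 20  # assumed sanity bound on a single record


def encode_record(payload: bytes) -> bytes:
    """Serialize one record with a checksummed header (step 1)."""
    length = struct.pack("<I", len(payload))
    header = length + struct.pack("<I", zlib.crc32(length))
    return header + payload + TRAILER.pack(zlib.crc32(payload))


def header_at(buf: bytes, pos: int):
    """Return the payload length if a valid header starts at pos, else None."""
    if pos + HEADER.size > len(buf):
        return None
    length, crc = HEADER.unpack_from(buf, pos)
    if zlib.crc32(buf[pos:pos + 4]) != crc or length > MAX_PAYLOAD:
        return None
    return length


def read_record(buf: bytes, pos: int):
    """Return (payload, next_pos) for a complete, valid record at pos, else None."""
    length = header_at(buf, pos)
    if length is None:
        return None
    start = pos + HEADER.size
    end = start + length
    if end + TRAILER.size > len(buf):
        return None  # record is cut short
    (crc,) = TRAILER.unpack_from(buf, end)
    if zlib.crc32(buf[start:end]) != crc:
        return None
    return buf[start:end], end + TRAILER.size


def recover(buf: bytes):
    """Return (records, append_offset); raise ValueError on corruption."""
    records = []
    pos = 0
    while pos < len(buf):
        rec = read_record(buf, pos)
        if rec is None:
            break
        records.append(rec[0])
        pos = rec[1]
    # Step 2: pos is where the first bad record starts. Probe every later
    # byte offset; any valid header there means corruption, not truncation.
    for probe in range(pos + 1, len(buf)):
        if header_at(buf, probe) is not None:
            raise ValueError(f"valid header at offset {probe}: corruption")
    # Step 3: new records should be written at pos, not at the end of the file.
    return records, pos
```

      A caller would truncate the file to the returned offset and append there, for example `buf[:offset] + encode_record(new_payload)` in this in-memory model. A run of zero bytes never passes `header_at` here, because the CRC32 of four zero bytes is nonzero, so a zero-filled tail is treated as recoverable.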

      Once this is done, containerized PB files will be almost identical to WAL segments, and we could consider merging the two. As far as I can tell, the only remaining major difference is that WAL segments allow one to write different kinds of PB messages, while containerized PB files are restricted to one type of PB message per file.

      For the time being, log block container metadata files don't use memory-mapped writes or preallocation, so the likelihood of extra zeroes in the file is low. Still, if the underlying filesystem or disk can truncate the file unexpectedly, the current code treats that truncation as fatal instead of recovering gracefully.

            People

              Assignee: Unassigned
              Reporter: Adar Dembo
              Votes: 0
              Watchers: 7
