  1. Kudu
  2. KUDU-2196

Failures of concurrent block transactions can corrupt on-disk consistency

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.5.0
    • Fix Version/s: None
    • Component/s: fs
    • Labels:
      None

      Description

      Failures among multiple concurrent block transactions can corrupt the underlying log block container.

      Currently, a log block container becomes available to other uncommitted writers (block transactions) as soon as the block being written to it is finalized, so concurrent writers can share the same log block container.

      When a block transaction commits, the container is marked read-only if any failure is encountered, in order to preserve on-disk consistency. However, this safeguard does not help when concurrent writers enter the commit phase at the same time. If one transaction fails, the other transactions continue committing without knowing that the container should now be read-only. As a result, a partial metadata record can persist on disk and be followed by full records, especially if the failure is transient (e.g. ENOSPC), leaving the container in an unrecoverable state.
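
      The interleaving described above can be illustrated with a small, deterministic toy model. This is only a sketch under stated assumptions, not Kudu's actual code: `ToyContainer`, `append_record`, `scan`, and the fixed 8-byte record size are all hypothetical names and simplifications chosen for illustration, and the transient failure is simulated rather than triggered by a real ENOSPC.

```python
RECORD_SIZE = 8  # hypothetical fixed size of one metadata record, in bytes


class ToyContainer:
    """Toy stand-in for a container's metadata file (not Kudu's API)."""

    def __init__(self):
        self.metadata = bytearray()
        self.read_only = False

    def append_record(self, record, fail_after=None):
        """Append one record. If fail_after is set, simulate a transient
        failure after that many bytes have already reached the file."""
        if fail_after is not None:
            self.metadata += record[:fail_after]  # partial record persists
            self.read_only = True                 # failing writer marks the container
            raise OSError("simulated transient failure (e.g. ENOSPC)")
        self.metadata += record


def scan(metadata):
    """Split the file into fixed-size records.

    Returns (complete_records, number_of_trailing_bytes)."""
    count = len(metadata) // RECORD_SIZE
    records = [bytes(metadata[i * RECORD_SIZE:(i + 1) * RECORD_SIZE])
               for i in range(count)]
    return records, len(metadata) % RECORD_SIZE


c = ToyContainer()
rec_a = b"AAAAAAAA"
rec_b = b"BBBBBBBB"

# Both writers check the read-only flag before either one has written.
a_may_write = not c.read_only
b_may_write = not c.read_only

# Writer A fails partway through its metadata record.
try:
    if a_may_write:
        c.append_record(rec_a, fail_after=3)
except OSError:
    pass

# Writer B acts on its stale check and appends a full record anyway.
if b_may_write:
    c.append_record(rec_b)

records, trailing = scan(c.metadata)
```

      In this toy model the container ends up read-only, yet writer B's full record sits behind writer A's three-byte fragment, so a reader scanning from the start no longer finds B's record at a record boundary. A fragment at the very end of the file could simply be cut off, but one followed by full records cannot be removed that way.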

      More details and the proposed solution can be found in the attached doc.

              People

              • Assignee:
                Unassigned
              • Reporter:
                Hao Hao (hahao)
