  1. Kudu
  2. KUDU-2196

Failures of concurrent block transactions can corrupt on-disk consistency

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.5.0
    • Fix Version/s: None
    • Component/s: fs
    • Labels: None

    Description

      Failures of multiple concurrent block transactions can corrupt the underlying log block container.

      Currently, a log block container is made available to other uncommitted writers (block transactions) as soon as the block being written is finalized, so concurrent writers can share the same log block container.

      While block transactions are committing, the container is marked read-only if any failure is encountered, in order to maintain on-disk consistency. However, this safeguard does not help when concurrent writers enter the commit phase at the same time. If one transaction fails, the other transactions keep committing without knowing that the container should now be read-only. This can leave a partial metadata record on disk followed by full records, especially if the failure is transient (e.g. ENOSPC), which leaves the container in an unrecoverable state.
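
      The interleaving described above can be reproduced with a small, deterministic toy model. This is only a sketch and not Kudu's C++ implementation: the names Container, Transaction, begin_commit, write_metadata, and parse_records, and the fixed 8-byte record size, are all assumptions made for illustration. It shows two transactions passing the read-only check before either writes, one of them failing partway through its metadata record, and the other still appending a full record behind the partial one.

```python
import errno
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

RECORD_SIZE = 8  # hypothetical fixed size of one metadata record, in bytes


@dataclass
class Container:
    """Toy stand-in for the metadata file of a shared log block container."""
    metadata: bytearray = field(default_factory=bytearray)
    read_only: bool = False

    def append_record(self, record: bytes, fail_after: Optional[int] = None) -> None:
        """Append one record, optionally failing after `fail_after` bytes."""
        if fail_after is not None:
            self.metadata += record[:fail_after]  # partial record reaches "disk"
            self.read_only = True                 # the existing safeguard kicks in
            raise OSError(errno.ENOSPC, "No space left on device")
        self.metadata += record


class Transaction:
    """Toy block transaction that checks the read-only flag only when commit begins."""

    def __init__(self, container: Container, record: bytes) -> None:
        self.container = container
        self.record = record
        self.committing = False

    def begin_commit(self) -> None:
        if self.container.read_only:
            raise RuntimeError("container is read-only")
        self.committing = True

    def write_metadata(self, fail_after: Optional[int] = None) -> None:
        # No second read-only check here: this is the gap the issue describes.
        assert self.committing
        self.container.append_record(self.record, fail_after)


def parse_records(metadata: bytes) -> Tuple[List[bytes], bytes]:
    """Split metadata into fixed-size records plus any trailing partial bytes."""
    full = len(metadata) - len(metadata) % RECORD_SIZE
    records = [metadata[i:i + RECORD_SIZE] for i in range(0, full, RECORD_SIZE)]
    return records, metadata[full:]


def simulate() -> Container:
    """Run the problematic interleaving of two concurrent committers."""
    container = Container()
    txn_a = Transaction(container, b"A" * RECORD_SIZE)
    txn_b = Transaction(container, b"B" * RECORD_SIZE)

    # Both transactions enter the commit phase before either one writes.
    txn_a.begin_commit()
    txn_b.begin_commit()

    # Transaction A hits a transient failure after 3 bytes.
    try:
        txn_a.write_metadata(fail_after=3)
    except OSError:
        pass

    # Transaction B is unaware of the failure and appends a full record.
    txn_b.write_metadata()
    return container
```

      In this model a reader that walks the file in fixed-size steps sees A's partial bytes fused with the start of B's record. A partial record at the very end of the file could simply be truncated, but one in the middle shifts every record that follows it.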

      More details and a proposed solution can be found in the attached doc.

            People

              Assignee: Unassigned
              Reporter: Hao Hao (hahao)