Kafka / KAFKA-9777

Purgatory locking bug can lead to hanging transaction

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.1.1, 2.0.1, 2.1.1, 2.2.2, 2.3.1, 2.4.1
    • Fix Version/s: 2.6.0
    • Component/s: None
    • Labels:
      None

      Description

      Once a transaction reaches the `PrepareCommit` or `PrepareAbort` state, the transaction coordinator must send markers to all partitions included in the transaction. After all markers have been sent, the transaction transitions to the corresponding completed state. Until this transition occurs, the producer cannot make any further progress.

      The transaction coordinator uses a purgatory to track completion of the markers that need to be sent. Once all markers have been written, the `DelayedTxnMarker` task becomes completable. The transition to the completed state depends on this task completing.
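
The tracking described above can be pictured with a small sketch. This is not the coordinator's code: the class `TxnMarkerTracker`, its methods, and the enum constants are hypothetical names chosen for illustration, and partitions are represented as plain strings. It only shows the idea that the task is completable once no partition is still waiting for its marker.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration; not Kafka's DelayedTxnMarker.
final class TxnMarkerTracker {
    enum State { PREPARE_COMMIT, PREPARE_ABORT, COMPLETE_COMMIT, COMPLETE_ABORT }

    private State state;
    private final Set<String> pendingPartitions;

    TxnMarkerTracker(State prepareState, Collection<String> partitions) {
        if (prepareState != State.PREPARE_COMMIT && prepareState != State.PREPARE_ABORT) {
            throw new IllegalArgumentException("must start in a prepare state");
        }
        this.state = prepareState;
        this.pendingPartitions = new HashSet<>(partitions);
    }

    // Called when the marker for one partition has been written.
    synchronized void markerWritten(String partition) {
        pendingPartitions.remove(partition);
    }

    // Completable only when every partition has received its marker.
    // On success, move to the completed state matching the prepare state.
    synchronized boolean tryComplete() {
        if (!pendingPartitions.isEmpty()) {
            return false;
        }
        if (state == State.PREPARE_COMMIT) {
            state = State.COMPLETE_COMMIT;
        } else if (state == State.PREPARE_ABORT) {
            state = State.COMPLETE_ABORT;
        }
        return true;
    }

    synchronized State state() {
        return state;
    }
}
```

In this sketch nothing moves the state forward on its own: some caller has to invoke `tryComplete()` after the last marker is written, which is the role `checkAndComplete` plays in the purgatory.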

      Related to KAFKA-8334, there is a bug in the locking protocol used to check completion of the `DelayedTxnMarker` task. The purgatory attempts to provide a "happens before" contract for task completion with `checkAndComplete`: if a task becomes completable before `checkAndComplete` is called, then that call should give it an opportunity to complete, as long as there is sufficient time remaining before expiration.

      The bug is that the locking protocol expects the operation lock to be exclusive to the operation. See https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/DelayedOperation.scala#L114. The logic assumes that if the lock cannot be acquired, then its current holder must be attempting completion of the same delayed operation. If that is not the case, then the "happens before" contract is broken and a task may not get completed until expiration even if it has been satisfied.
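
To make the assumption concrete, here is a minimal sketch of a `tryLock`-based completion protocol of the kind described above. It is an illustration under assumptions, not a copy of `DelayedOperation.scala`: the class `SketchDelayedOperation`, the `tryCompletePending` flag, and the `satisfied` callback are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.Lock;
import java.util.function.BooleanSupplier;

// Hypothetical illustration of the protocol; not Kafka's implementation.
final class SketchDelayedOperation {
    private final Lock lock;
    private final BooleanSupplier satisfied;
    private final AtomicBoolean completed = new AtomicBoolean(false);
    private final AtomicBoolean tryCompletePending = new AtomicBoolean(false);

    SketchDelayedOperation(Lock lock, BooleanSupplier satisfied) {
        this.lock = lock;
        this.satisfied = satisfied;
    }

    boolean isCompleted() {
        return completed.get();
    }

    // Completes the operation at most once, and only if its condition holds.
    private boolean tryComplete() {
        return satisfied.getAsBoolean() && completed.compareAndSet(false, true);
    }

    boolean maybeTryComplete() {
        boolean retry;
        boolean done = false;
        do {
            if (lock.tryLock()) {
                try {
                    tryCompletePending.set(false);
                    done = tryComplete();
                } finally {
                    lock.unlock();
                }
                // Another thread may have asked for a re-check while we held the lock.
                retry = tryCompletePending.get();
            } else {
                // ASSUMPTION baked into the protocol: whoever holds the lock is
                // running this same method and will see the pending flag.
                // Retry once ourselves only if we were the first to set the flag.
                retry = !tryCompletePending.getAndSet(true);
            }
        } while (!isCompleted() && retry);
        return done;
    }
}
```

With a lock that only this operation uses, the hand-off through the pending flag is sound. When the lock is shared with unrelated code, the branch that fails `tryLock` leaves a flag that nobody may ever read.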

      In the case of `DelayedTxnMarker`, the lock in use is the read side of a read-write lock which is used for partition loading: https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/coordinator/transaction/TransactionMarkerChannelManager.scala#L264. In fact, if the lock cannot be acquired, it means that the write side is being held in order to complete some loading operation, and that holder will definitely not attempt completion of the delayed operation. If this happens on the last call to `checkAndComplete` after all markers have been written, then the transition to the completed state will never occur.
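
The failure mode can be reproduced in isolation with the standard `ReentrantReadWriteLock`. The sketch below is a hypothetical stand-in, not coordinator code: a "loader" thread holds the write lock while the calling thread makes its one and only attempt to run a completion check under the read lock. The names `MissedCompletionDemo`, `stateLock`, and `completionCheckRanWhileLoading` are assumptions for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration of the missed completion; not Kafka code.
final class MissedCompletionDemo {

    // Returns true if the completion check ran, false if it was skipped.
    static boolean completionCheckRanWhileLoading() {
        ReentrantReadWriteLock stateLock = new ReentrantReadWriteLock();
        CountDownLatch loaderHoldsWriteLock = new CountDownLatch(1);
        CountDownLatch finishLoading = new CountDownLatch(1);

        // Stand-in for partition loading: holds the write lock for a while
        // and never looks at any delayed operation.
        Thread loader = new Thread(() -> {
            stateLock.writeLock().lock();
            try {
                loaderHoldsWriteLock.countDown();
                finishLoading.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                stateLock.writeLock().unlock();
            }
        });
        loader.start();

        boolean checkRan = false;
        try {
            loaderHoldsWriteLock.await();
            // The last completion attempt: the read lock is unavailable
            // because another thread holds the write lock, so the check
            // is skipped rather than delayed.
            if (stateLock.readLock().tryLock()) {
                try {
                    checkRan = true;
                } finally {
                    stateLock.readLock().unlock();
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            // Let the loader finish; nothing re-runs the check afterwards.
            finishLoading.countDown();
        }
        return checkRan;
    }
}
```

Calling `MissedCompletionDemo.completionCheckRanWhileLoading()` returns `false`: the loader releases the write lock without ever attempting the check, so unless some later event triggers another attempt, the satisfied operation stays incomplete.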

              People

              • Assignee:
                hachikuji Jason Gustafson
                Reporter:
                hachikuji Jason Gustafson