  1. Kafka
  2. KAFKA-10080

IllegalStateException after duplicate CompleteCommit append to transaction log

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.6.0
    • Component/s: None
    • Labels:
      None

      Description

      We noticed this exception in the logs:

      java.lang.IllegalStateException: TransactionalId foo completing transaction state transition while it does not have a pending state
              at kafka.coordinator.transaction.TransactionMetadata.$anonfun$completeTransitionTo$1(TransactionMetadata.scala:357)
              at kafka.coordinator.transaction.TransactionMetadata.completeTransitionTo(TransactionMetadata.scala:353)
              at kafka.coordinator.transaction.TransactionStateManager.$anonfun$appendTransactionToLog$3(TransactionStateManager.scala:595)
              at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
              at kafka.coordinator.transaction.TransactionMetadata.inLock(TransactionMetadata.scala:188)
              at kafka.coordinator.transaction.TransactionStateManager.$anonfun$appendTransactionToLog$15$adapted(TransactionStateManager.scala:587)
              at kafka.server.DelayedProduce.onComplete(DelayedProduce.scala:126)
              at kafka.server.DelayedOperation.forceComplete(DelayedOperation.scala:70)
              at kafka.server.DelayedProduce.tryComplete(DelayedProduce.scala:107)
              at kafka.server.DelayedOperation.maybeTryComplete(DelayedOperation.scala:121)
              at kafka.server.DelayedOperationPurgatory$Watchers.tryCompleteWatched(DelayedOperation.scala:378)
              at kafka.server.DelayedOperationPurgatory.checkAndComplete(DelayedOperation.scala:280)
              at kafka.cluster.DelayedOperations.checkAndCompleteAll(Partition.scala:122)
              at kafka.cluster.Partition.tryCompleteDelayedRequests(Partition.scala:1023)
              at kafka.cluster.Partition.updateFollowerFetchState(Partition.scala:740)
      

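      On inspection, we found two CompleteCommit entries in the transaction state log, which explains the failed transition: once the first append completes the pending transition, the callback for the second presumably finds no pending state left to complete. Indeed, the logic for writing the CompleteCommit message does seem prone to race conditions.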
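
      To illustrate why a duplicate CompleteCommit append would surface as this exception, here is a minimal Java sketch of a pending-state transition guard. It is a simplified model, not the broker's code: the class `TxnMetadata`, the method `prepareTransitionTo`, and the `State` enum are hypothetical names chosen for illustration, and only `completeTransitionTo` and the exception message are taken from the stack trace above.

```java
import java.util.Optional;

public class PendingTransitionSketch {
    // Hypothetical subset of transaction states, for illustration only.
    enum State { PREPARE_COMMIT, COMPLETE_COMMIT }

    // Hypothetical, simplified stand-in for per-transactional-id metadata.
    static final class TxnMetadata {
        private final String transactionalId;
        private State state;
        private Optional<State> pendingState = Optional.empty();

        TxnMetadata(String transactionalId, State initial) {
            this.transactionalId = transactionalId;
            this.state = initial;
        }

        // Record the intended target state before the log append is issued.
        synchronized void prepareTransitionTo(State target) {
            if (pendingState.isPresent()) {
                throw new IllegalStateException("TransactionalId " + transactionalId
                        + " already has a pending transition to " + pendingState.get());
            }
            pendingState = Optional.of(target);
        }

        // Invoked from the append callback; consumes the pending state.
        synchronized void completeTransitionTo(State target) {
            if (!pendingState.isPresent() || pendingState.get() != target) {
                throw new IllegalStateException("TransactionalId " + transactionalId
                        + " completing transaction state transition while it does not"
                        + " have a pending state");
            }
            state = target;
            pendingState = Optional.empty();
        }

        synchronized State state() {
            return state;
        }
    }

    // Returns true if the second completion of the same transition is rejected.
    static boolean duplicateCompletionFails() {
        TxnMetadata metadata = new TxnMetadata("foo", State.PREPARE_COMMIT);
        metadata.prepareTransitionTo(State.COMPLETE_COMMIT);
        metadata.completeTransitionTo(State.COMPLETE_COMMIT); // callback of first append
        try {
            metadata.completeTransitionTo(State.COMPLETE_COMMIT); // callback of duplicate append
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("duplicate completion rejected: " + duplicateCompletionFails());
    }
}
```

      In this model the guard itself behaves as intended; the problem is upstream, in whatever allows two CompleteCommit appends to be issued for a single prepared transition.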

              People

              • Assignee:
                hachikuji Jason Gustafson
                Reporter:
                hachikuji Jason Gustafson