Kafka / KAFKA-10080

IllegalStateException after duplicate CompleteCommit append to transaction log

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.6.0
    • Component/s: None
    • Labels: None

    Description

      We noticed this exception in the logs:

      java.lang.IllegalStateException: TransactionalId foo completing transaction state transition while it does not have a pending state
              at kafka.coordinator.transaction.TransactionMetadata.$anonfun$completeTransitionTo$1(TransactionMetadata.scala:357)
              at kafka.coordinator.transaction.TransactionMetadata.completeTransitionTo(TransactionMetadata.scala:353)
              at kafka.coordinator.transaction.TransactionStateManager.$anonfun$appendTransactionToLog$3(TransactionStateManager.scala:595)
              at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
              at kafka.coordinator.transaction.TransactionMetadata.inLock(TransactionMetadata.scala:188)
              at kafka.coordinator.transaction.TransactionStateManager.$anonfun$appendTransactionToLog$15$adapted(TransactionStateManager.scala:587)
              at kafka.server.DelayedProduce.onComplete(DelayedProduce.scala:126)
              at kafka.server.DelayedOperation.forceComplete(DelayedOperation.scala:70)
              at kafka.server.DelayedProduce.tryComplete(DelayedProduce.scala:107)
              at kafka.server.DelayedOperation.maybeTryComplete(DelayedOperation.scala:121)
              at kafka.server.DelayedOperationPurgatory$Watchers.tryCompleteWatched(DelayedOperation.scala:378)
              at kafka.server.DelayedOperationPurgatory.checkAndComplete(DelayedOperation.scala:280)
              at kafka.cluster.DelayedOperations.checkAndCompleteAll(Partition.scala:122)
              at kafka.cluster.Partition.tryCompleteDelayedRequests(Partition.scala:1023)
              at kafka.cluster.Partition.updateFollowerFetchState(Partition.scala:740)
      

      On inspection, we found two CompleteCommit entries in the transaction state log, which explains the failed transition: once the first entry completes the transition, the second has no pending state left to complete. The logic for writing the CompleteCommit message does appear to be prone to race conditions.
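To illustrate why a duplicate completion produces this exception, here is a minimal sketch in Java. It is a hypothetical, simplified model and not Kafka's actual TransactionMetadata code: the class TxnMetadataSketch, its State enum, and its method bodies are assumptions made for illustration. Only the method name completeTransitionTo and the exception message are taken from the stack trace above. The sketch does not attempt to show how the second CompleteCommit entry came to be written, only what happens when two completions arrive for one pending transition.

```java
import java.util.Optional;

// Hypothetical, simplified model for illustration only; NOT Kafka's TransactionMetadata.
class TxnMetadataSketch {
    enum State { PREPARE_COMMIT, COMPLETE_COMMIT }

    private final String transactionalId;
    private State state = State.PREPARE_COMMIT;
    // The state we intend to reach once the log append succeeds (assumed design).
    private Optional<State> pendingState = Optional.empty();

    TxnMetadataSketch(String transactionalId) {
        this.transactionalId = transactionalId;
    }

    // Record the intended target state before appending to the log.
    synchronized void prepareTransitionTo(State target) {
        if (pendingState.isPresent()) {
            throw new IllegalStateException("TransactionalId " + transactionalId
                    + " already has a pending state " + pendingState.get());
        }
        pendingState = Optional.of(target);
    }

    // Invoked when an append completes: apply the pending state and clear it.
    synchronized void completeTransitionTo(State target) {
        if (!pendingState.isPresent()) {
            throw new IllegalStateException("TransactionalId " + transactionalId
                    + " completing transaction state transition while it does not have a pending state");
        }
        if (pendingState.get() != target) {
            throw new IllegalStateException("TransactionalId " + transactionalId
                    + " pending state " + pendingState.get() + " does not match " + target);
        }
        state = target;
        pendingState = Optional.empty();
    }

    synchronized State state() {
        return state;
    }

    public static void main(String[] args) {
        TxnMetadataSketch metadata = new TxnMetadataSketch("foo");
        metadata.prepareTransitionTo(State.COMPLETE_COMMIT);

        // First completion: the pending state is applied and cleared.
        metadata.completeTransitionTo(State.COMPLETE_COMMIT);

        // Second completion for the same transition: nothing is pending any more.
        try {
            metadata.completeTransitionTo(State.COMPLETE_COMMIT);
        } catch (IllegalStateException e) {
            System.out.println("Second completion rejected: " + e.getMessage());
        }
    }
}
```

In this model the first completion clears the pending state, so any further completion for the same transition is rejected. A fix along these lines would have to ensure that at most one CompleteCommit append is in flight per transition; the sketch takes no position on how the actual fix achieves that.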

            People

              Assignee: Jason Gustafson (hachikuji)
              Reporter: Jason Gustafson (hachikuji)