Details
Type: Bug
Status: Open
Priority: Minor
Resolution: Unresolved
Description
I'm still looking into a PR failure in my client and noticed something a bit strange. I know that technically I should be using RequireStableFetchOffsets in my transaction tests to guard against rebalances that happen while a transaction is not yet finalized. I'll be adding that.
However, these tests have never failed against ZooKeeper mode. The client goes to considerable lengths to avoid needing KIP-447 behavior, and the assumption with localhost testing is that things run fast enough (and that there are enough guards) that problems would not be encountered.
That assumption appears not to hold with a KRaft broker. Looking at __transaction_state, the following looks especially problematic:
__transaction_state partition 33 offset 7 at [2022-10-18 11:15:37.821]
  TxnMetadataKey(0) 9f87dc04dc3f4d5b15ef3072c531cf46327278307df8e149fa966462cd40c10b
  TxnMetadataValue(0)
    ProducerID 41
    ProducerEpoch 0
    TimeoutMillis 120000
    State PrepareCommit
    Topics __consumer_offsets=>[13] e7c7d971626fbaf4bfb33975e57089167939e6acabb4c4fc534eb148462e45cc=>[4 5 12 16]
    LastUpdateTimestamp 1666113337821
    StartTimestamp 1666113335311
__transaction_state partition 33 offset 8 at [2022-10-18 11:16:11.419]
  TxnMetadataKey(0) 9f87dc04dc3f4d5b15ef3072c531cf46327278307df8e149fa966462cd40c10b
  TxnMetadataValue(0)
    ProducerID 41
    ProducerEpoch 0
    TimeoutMillis 120000
    State CompleteCommit
    Topics
    LastUpdateTimestamp 1666113337821
    StartTimestamp 1666113335311
I captured this using my kcl tool.
Note that the transaction enters PrepareCommit at 11:15:37.821 and then enters CompleteCommit at 11:16:11.419. As far as I can tell, this means that in my single-node KRaft setup, the broker took roughly 34 seconds to transition between commit states internally.
I noticed this in tests because a rebalance happened during those 34 seconds, which caused duplicate consumption: the transactional offset commits were not yet finalized, so the old commits were picked up.
This ticket is related to KAFKA-14312: this failure is surfacing now that I have worked around KAFKA-14312 within the client itself.