Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 2.6.2
None
None
Description
We are using Kafka Streams 2.6.2 and running stateful aggregations with exactly-once semantics.
The processing logic is:
consume input records -> compute intermediate aggregates and buffer them in a state store backed by a changelog topic -> punctuate every 15 seconds, flushing the state store and sending the aggregated records downstream -> perform the final aggregation and send the result to the output topic
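To make the buffer-and-flush stage of this flow concrete, here is a minimal, library-free model of it. It is only a sketch: the class name, the in-memory map standing in for the changelog-backed state store, and the choice of a sum as the aggregate are all assumptions for illustration, not the actual application code or the Kafka Streams API.

```java
import java.util.HashMap;
import java.util.Map;

public class BufferedAggregationSketch {
    // Hypothetical stand-in for the changelog-backed state store (in-memory only).
    private final Map<String, Long> store = new HashMap<>();

    // Fold each input record into the buffered intermediate aggregate.
    // A sum is assumed here; the report does not say which aggregate is used.
    void process(String key, long value) {
        store.merge(key, value, Long::sum);
    }

    // Models the punctuation (every 15 seconds in the report): returns the
    // buffered aggregates for forwarding downstream and clears the buffer.
    Map<String, Long> punctuate() {
        Map<String, Long> flushed = new HashMap<>(store);
        store.clear();
        return flushed;
    }

    public static void main(String[] args) {
        BufferedAggregationSketch stage = new BufferedAggregationSketch();
        stage.process("a", 1);
        stage.process("a", 2);
        stage.process("b", 5);
        System.out.println(stage.punctuate()); // aggregates buffered so far
        System.out.println(stage.punctuate()); // buffer is empty after a flush
    }
}
```

In the real topology the flushed records are written inside a producer transaction, which is why the 15-second punctuation interval and the transaction timeout are both relevant to this report.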
Since we run on spot instances, one of the pods was restarted, which triggered a rebalance, and state was being restored from the changelog topic.
We noticed ProducerFencedException errors:
org.apache.kafka.common.errors.ProducerFencedException: Producer attempted an operation with an old epoch. Either there is a newer producer with the same transactionalId, or the producer's transaction has been expired by the broker.
After this, a few partitions were stuck and no records were processed until we restarted the application.
We had configured:
transaction.timeout.ms to 30 seconds
session.timeout.ms to 30 seconds
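For reference, the settings above might be expressed roughly as follows. This is a sketch under assumptions: it uses plain string keys so that it compiles without a Kafka dependency, it assumes the `producer.` and `consumer.` prefixes were used to route the two timeouts to the embedded clients (the report does not say how they were set), and it omits required settings such as `application.id` and `bootstrap.servers`. The class and method names are hypothetical.

```java
import java.util.Properties;

public class EosConfigSketch {
    // Builds the settings described in the report as plain key/value pairs.
    static Properties buildConfig() {
        Properties props = new Properties();
        // Exactly-once processing; "exactly_once" is the value used in the 2.6.x line.
        props.put("processing.guarantee", "exactly_once");
        // Assumed prefixes: "producer." and "consumer." target the embedded clients.
        props.put("producer.transaction.timeout.ms", "30000"); // 30 seconds
        props.put("consumer.session.timeout.ms", "30000");     // 30 seconds
        return props;
    }

    public static void main(String[] args) {
        // Print the resulting configuration (iteration order is not guaranteed).
        buildConfig().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

The fenced-producer message names two possible causes, a newer producer with the same transactionalId or a transaction expired by the broker, so the 30-second transaction timeout is a natural value to examine alongside how long restoration and processing take after the rebalance.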
Could you please advise whether there is a known fix for this edge case?