Description
When Streams detects a task migration event in one of its threads, it currently always triggers a call to consumer.poll, hoping that this triggers a rebalance and thereby cleans up the records buffered from partitions that are no longer owned. However, because the rebalance is driven by heartbeat responses, which leaves a window for a race, the rebalance is not guaranteed to be triggered when the task migration happens. As a result, the records buffered in the consumer may not be cleaned up and may later be processed by Streams, which then realizes that they no longer belong to the thread and fails with:
java.lang.IllegalStateException: Record's partition does not belong to this partition-group.
Note that this issue is only relevant when EOS is turned on and, based on the default heartbeat.interval.ms value (3 seconds), the likelihood of hitting the race should not be high.
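To make the failure mode concrete, here is a minimal sketch of dropping buffered records for partitions the thread no longer owns. It is not the Streams implementation or the actual fix for this issue: the class BufferedRecordFilter, the method retainOwned, and the use of plain strings such as "input-0" for partitions are all assumptions made for illustration, so that the example runs without any Kafka dependency.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class BufferedRecordFilter {

    // Hypothetical helper (not a Kafka API): keeps only the buffered records
    // whose partition is still in the thread's current assignment.
    // Partitions are modeled as plain strings to avoid Kafka dependencies.
    public static Map<String, List<String>> retainOwned(
            Map<String, List<String>> bufferedByPartition,
            Set<String> ownedPartitions) {
        Map<String, List<String>> retained = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : bufferedByPartition.entrySet()) {
            if (ownedPartitions.contains(entry.getKey())) {
                // Copy the list so the caller's buffer is not shared.
                retained.put(entry.getKey(), new ArrayList<>(entry.getValue()));
            }
        }
        return retained;
    }

    public static void main(String[] args) {
        // Records were buffered for two partitions before the migration.
        Map<String, List<String>> buffered = new LinkedHashMap<>();
        buffered.put("input-0", List.of("a", "b"));
        buffered.put("input-1", List.of("c"));

        // After the migration the thread only owns "input-0".
        Map<String, List<String>> retained = retainOwned(buffered, Set.of("input-0"));
        System.out.println(retained.keySet());
    }
}
```

In this sketch, records buffered for "input-1" are discarded before processing instead of reaching a partition group that no longer contains that partition, which is the situation that raises the IllegalStateException above.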