Details
Type: Bug
Status: Resolved
Priority: Critical
Resolution: Fixed
Description
We've discovered that in some uncommon cases, the consumer can send out successive heartbeats without waiting for the response to the previous one. This can cause the consumer to revoke the partitions it was just assigned. For example:
The consumer first sends out a heartbeat with epoch=0 and memberId=''
The consumer then rapidly sends out another heartbeat with epoch=0 and memberId='' because it has not received any response and has therefore not updated its local state.
The consumer receives assignments from the first heartbeat and reconciles its assignment.
Since the second heartbeat has epoch=0 and memberId='', the server will think this is a new member joining and therefore send out an empty assignment.
The consumer receives the response from the second heartbeat and revokes all of its partitions.
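The sequence above can be replayed with a toy model. This is only a sketch: `ToyCoordinator`, `Response`, and `replay` are hypothetical names invented for illustration and are not Kafka classes, and the rule that a second "new" member gets an empty assignment is hard-coded to match the description.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the race described above; none of these types are Kafka's own. */
public class HeartbeatRaceSketch {

    /** Hypothetical heartbeat response: identity plus the partitions assigned by the server. */
    static final class Response {
        final String memberId;
        final int epoch;
        final List<Integer> assignment;

        Response(String memberId, int epoch, List<Integer> assignment) {
            this.memberId = memberId;
            this.epoch = epoch;
            this.assignment = assignment;
        }
    }

    /** Hypothetical coordinator: epoch=0 with an empty memberId is treated as a new member. */
    static final class ToyCoordinator {
        private int joins = 0;

        Response heartbeat(String memberId, int epoch) {
            if (epoch == 0 && memberId.isEmpty()) {
                joins++;
                // Assumption for illustration: the first joiner gets all partitions,
                // any later "new" member starts with an empty assignment.
                List<Integer> assignment = joins == 1 ? List.of(0, 1, 2) : List.<Integer>of();
                return new Response("member-" + joins, 1, assignment);
            }
            return new Response(memberId, epoch, List.of(0, 1, 2));
        }
    }

    /** Replays the five steps and returns the partitions the consumer owns at the end. */
    public static List<Integer> replay() {
        ToyCoordinator coordinator = new ToyCoordinator();
        String memberId = "";
        int epoch = 0;

        // Steps 1 and 2: two heartbeats go out before any response updates local state.
        Response first = coordinator.heartbeat(memberId, epoch);
        Response second = coordinator.heartbeat(memberId, epoch);

        // Step 3: the consumer reconciles the assignment from the first response.
        List<Integer> owned = new ArrayList<>(first.assignment);

        // Steps 4 and 5: the second response carries an empty assignment,
        // so every partition that is not in it gets revoked.
        owned.retainAll(second.assignment);
        return owned;
    }

    public static void main(String[] args) {
        System.out.println("owned after second response: " + replay());
    }
}
```

In this model the consumer ends up owning nothing, even though the first response assigned it partitions.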
There are two issues associated with this bug:
- inflight logic
- rapid poll: In KAFKA-16389 we've observed the consumer polling interval to be a few ms.
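One way to picture a guard against both issues is sketched here. This is not the fix that was applied for this ticket: `HeartbeatGateSketch`, `trySend`, `onResponse`, and the `minIntervalMs` parameter are hypothetical names, and the minimum-interval check is an assumed mitigation for the rapid-poll case.

```java
/** Hypothetical gate deciding whether a heartbeat may be sent; not Kafka's implementation. */
public class HeartbeatGateSketch {
    private final long minIntervalMs; // assumed lower bound between two heartbeats
    private boolean inflight = false; // true while a response is still pending
    private boolean sentOnce = false;
    private long lastSentMs = 0L;

    public HeartbeatGateSketch(long minIntervalMs) {
        this.minIntervalMs = minIntervalMs;
    }

    /** Returns true and marks the request as in flight if a heartbeat may go out at nowMs. */
    public boolean trySend(long nowMs) {
        if (inflight) {
            return false; // inflight logic: never send while a response is pending
        }
        if (sentOnce && nowMs - lastSentMs < minIntervalMs) {
            return false; // rapid poll: too soon after the previous heartbeat
        }
        inflight = true;
        sentOnce = true;
        lastSentMs = nowMs;
        return true;
    }

    /** Called when the response (or a failure) for the pending heartbeat arrives. */
    public void onResponse() {
        inflight = false;
    }

    public static void main(String[] args) {
        HeartbeatGateSketch gate = new HeartbeatGateSketch(1000);
        System.out.println(gate.trySend(0));    // first heartbeat goes out
        System.out.println(gate.trySend(5));    // blocked: response still pending
        gate.onResponse();
        System.out.println(gate.trySend(10));   // blocked: interval not yet elapsed
        System.out.println(gate.trySend(1000)); // allowed again
    }
}
```

With a gate like this, a poll loop that spins every few milliseconds would call `trySend` often but only emit a heartbeat once the previous response has updated the local epoch and memberId.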