Details
Type: Bug
Status: Resolved
Priority: Critical
Resolution: Fixed
3.7.0
Description
When running system tests for the new consumer, I've hit an issue where the HeartbeatRequestManager sends out multiple concurrent CONSUMER_GROUP_REQUEST RPCs. The effect is that the coordinator creates multiple members, which causes downstream assignment problems.
Here's the order of events:
- Time 202: HeartbeatRequestManager.poll() determines it's OK to send a request. In so doing, it updates the RequestState's lastSentMs to the current timestamp, 202
- Time 236: the response is received and the response handler is invoked, setting the RequestState's lastReceivedMs to the current timestamp, 236
- Time 236: HeartbeatRequestManager.poll() is invoked again, and it sees that it's OK to send a request. It creates another request, once again updating the RequestState's lastSentMs to the current timestamp, 236
- Time 237: HeartbeatRequestManager.poll() is invoked again, and ERRONEOUSLY decides it's OK to send another request, despite one already being in flight.
Here's the problem with requestInFlight():
public boolean requestInFlight() { return this.lastSentMs > -1 && this.lastReceivedMs < this.lastSentMs; }
In our case, lastReceivedMs is 236 and lastSentMs is also 236. The received timestamp is equal to the sent timestamp, not less than it, so requestInFlight() returns false even though the second request is still outstanding.
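To make the timeline concrete, here is a minimal sketch that replays the events above against a simplified stand-in for RequestState. Only lastSentMs, lastReceivedMs, and the requestInFlight() comparison come from this report; the class names, onSend(), onResponse(), and the explicit-flag variant are hypothetical, and the flag is just one possible way to avoid the equal-timestamp case, not necessarily the fix that was merged.

```java
// Hypothetical, simplified model of the state described in this report.
public class RequestStateDemo {

    /** Uses the timestamp comparison quoted above. */
    static class TimestampRequestState {
        long lastSentMs = -1;
        long lastReceivedMs = -1;

        void onSend(long nowMs) { lastSentMs = nowMs; }
        void onResponse(long nowMs) { lastReceivedMs = nowMs; }

        boolean requestInFlight() {
            return lastSentMs > -1 && lastReceivedMs < lastSentMs;
        }
    }

    /** Assumed alternative: track in-flight status with an explicit flag. */
    static class FlagRequestState {
        long lastSentMs = -1;
        long lastReceivedMs = -1;
        private boolean inFlight = false;

        void onSend(long nowMs) { lastSentMs = nowMs; inFlight = true; }
        void onResponse(long nowMs) { lastReceivedMs = nowMs; inFlight = false; }

        boolean requestInFlight() { return inFlight; }
    }

    /** Replays 202 send, 236 response, 236 send; returns the check made at 237. */
    static boolean timestampCheckAfterTimeline() {
        TimestampRequestState state = new TimestampRequestState();
        state.onSend(202);
        state.onResponse(236);
        state.onSend(236); // same millisecond as the response
        return state.requestInFlight();
    }

    /** Same timeline, using the explicit flag instead of timestamps. */
    static boolean flagCheckAfterTimeline() {
        FlagRequestState state = new FlagRequestState();
        state.onSend(202);
        state.onResponse(236);
        state.onSend(236);
        return state.requestInFlight();
    }

    public static void main(String[] args) {
        // Prints false: the second request is not detected as in flight.
        System.out.println("timestamp check: " + timestampCheckAfterTimeline());
        // Prints true: the flag does not depend on clock resolution.
        System.out.println("flag check: " + flagCheckAfterTimeline());
    }
}
```

The timestamp comparison breaks whenever a response and the next send land in the same millisecond; a flag set on send and cleared on response does not depend on clock granularity.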