Description
From hai_lin in KAFKA-9181:
I noticed an issue after this patch; here is what I observed.
Consumer metadata might skip the first metadata update because groupSubscription is not reset. In my case, the consumer coordinator thread hijacks the update by calling newMetadataRequestAndVersion with an outdated groupSubscription before joinPrepare() happens. groupSubscription is reset later and the metadata is eventually updated, so this is not an issue for the initial consumer subscribe (groupSubscription is empty anyway), but it can happen on a subsequent subscribe when groupSubscription is not empty. This creates a discrepancy between subscription and groupSubscription: if a new metadata request happens in between, metadataTopics will return outdated group information.
The happy path
- Consumer calls subscribe -> needUpdated is set, requestVersion is bumped, and subscription is updated in SubscriptionState -> joinPrepare() is called in the first poll() to reset groupSubscription -> on the next metadata update, metadataTopics() returns subscription because groupSubscription is empty -> an update request is issued to the broker to fetch partition information for the new topic
In our case
- Consumer calls subscribe -> needUpdated is set, requestVersion is bumped, and subscription (not groupSubscription) is updated in SubscriptionState -> the consumer coordinator heartbeat thread issues a metadata request, and SubscriptionState hands out the current requestVersion together with the outdated groupSubscription -> the metadata update request is made with the outdated subscription -> the response comes back to the client and, since requestVersion is still the latest, the needUpdated flag is reset -> joinPrepare() is called and resets groupSubscription -> no new metadata update request follows because needUpdated was already reset -> the next metadata request happens only when metadata.max.age.ms expires.
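The two orderings above can be reproduced with a small single-threaded model. This is only a sketch under stated assumptions: the `Model` class, its fields, and `metadataRoundTrip()` are hypothetical stand-ins that mimic the behaviour described in this report, not the real Kafka `SubscriptionState` or `Metadata` classes, and the heartbeat thread is simulated simply by reordering the calls.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of the race described above. All names here are illustrative
// assumptions; none of this is the actual Kafka client code.
class MetadataRaceSketch {

    static class Model {
        Set<String> subscription = new HashSet<>();
        Set<String> groupSubscription = new HashSet<>();
        Set<String> fetchedTopics = new HashSet<>();
        boolean needUpdated = false;
        int requestVersion = 0;

        // subscribe() updates subscription only; groupSubscription is untouched.
        void subscribe(Set<String> topics) {
            subscription = new HashSet<>(topics);
            needUpdated = true;
            requestVersion++;
        }

        // As described in the report: subscription is used only when
        // groupSubscription is empty.
        Set<String> metadataTopics() {
            return groupSubscription.isEmpty() ? subscription : groupSubscription;
        }

        // Resets groupSubscription, as joinPrepare() does in the report.
        void joinPrepare() {
            groupSubscription = new HashSet<>();
        }

        // Builds a request from the current state and handles its response.
        void metadataRoundTrip() {
            int versionAtSend = requestVersion;
            Set<String> requested = new HashSet<>(metadataTopics());
            fetchedTopics = requested;
            // The response clears the flag when the version is still current.
            if (versionAtSend == requestVersion) {
                needUpdated = false;
            }
        }
    }

    // A consumer that already joined once, so groupSubscription is not empty.
    static Model previouslyJoined() {
        Model m = new Model();
        m.subscription = new HashSet<>(Set.of("old-topic"));
        m.groupSubscription = new HashSet<>(Set.of("old-topic"));
        return m;
    }

    // Happy path: joinPrepare() runs before the metadata request.
    static Model happyPath() {
        Model m = previouslyJoined();
        m.subscribe(Set.of("new-topic"));
        m.joinPrepare();
        m.metadataRoundTrip();
        return m;
    }

    // Racy path: the (simulated) heartbeat thread requests metadata first.
    static Model racyPath() {
        Model m = previouslyJoined();
        m.subscribe(Set.of("new-topic"));
        m.metadataRoundTrip();
        m.joinPrepare();
        return m;
    }

    public static void main(String[] args) {
        Model ok = happyPath();
        Model bad = racyPath();
        System.out.println("happy: fetched=" + ok.fetchedTopics + " needUpdated=" + ok.needUpdated);
        System.out.println("racy:  fetched=" + bad.fetchedTopics + " needUpdated=" + bad.needUpdated);
    }
}
```

In this model both paths end with needUpdated cleared, but only the happy path has fetched metadata for the new topic; the racy path is left with the old topic and nothing scheduled to correct it, which corresponds to waiting for the metadata age limit.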