Description
Apparently, if a consumer rejoins the group with the same subscription userdata that it previously sent, it will not trigger a rebalance. The one exception here is that the group leader will always trigger a rebalance when it rejoins the group.
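To make the rule concrete, here is a minimal sketch of the behavior as described above. It is a toy model, not the group coordinator's actual code; the class name `RejoinModel` and the method `triggersRebalance` are made up for illustration.

```java
import java.util.Arrays;

class RejoinModel {
    // Toy model of the rule described above (hypothetical names, not broker code):
    // a rejoin triggers a rebalance if the member is the group leader,
    // or if its subscription userdata differs from what it sent previously.
    static boolean triggersRebalance(boolean isLeader, byte[] previousUserData, byte[] rejoinUserData) {
        return isLeader || !Arrays.equals(previousUserData, rejoinUserData);
    }

    public static void main(String[] args) {
        byte[] previous = {0, 1, 2};
        byte[] unchanged = {0, 1, 2};
        System.out.println("follower, same userdata: " + triggersRebalance(false, previous, unchanged));
        System.out.println("leader, same userdata: " + triggersRebalance(true, previous, unchanged));
    }
}
```

The case that matters for this ticket is the first one: a non-leader rejoining with byte-identical userdata.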
This has implications for KIP-441, where we rely on asking an arbitrary thread to enforce the followup probing rebalances. Technically we do ask a thread living on the same instance as the leader, so the odds that the leader will be chosen aren't completely abysmal, but for any multithreaded application they are still at best only 50%: assuming the thread is picked uniformly, an instance with N threads gives a 1-in-N chance, which is at most 1/2 once N ≥ 2.
Of course in general the userdata will have changed within a span of 10 minutes, so the actual likelihood of hitting this is much lower – it can only happen if the member's task offset sums remained unchanged. Realistically, this probably requires that the member only have fully-restored active tasks (encoded with the constant sentinel -2) and that no tasks be added or removed.
One solution would be to make sure the leader is responsible for the probing rebalance. To do this, we would need to somehow expose the memberId of the thread's main consumer to the partition assignor. I'm actually not sure if that's currently possible to figure out or not. If not, we could just assign the probing rebalance to every thread on the leader's instance. This shouldn't result in multiple followup rebalances as the rebalance schedule will be updated/reset on the first followup rebalance.
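The fallback of scheduling the probing rebalance on every thread of the leader's instance could look roughly like the sketch below. It assumes the assignor can map each member to an instance identifier and knows its own instance's identifier. The names `ProbingRebalanceSketch`, `membersOnInstance`, and the string-typed ids are hypothetical, not the Streams assignor's API.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class ProbingRebalanceSketch {
    // Hypothetical helper: returns every member (thread consumer) that lives on the
    // given instance, so that all of them can be told to trigger the followup rebalance.
    // One of them is the leader, even though we cannot tell which.
    static Set<String> membersOnInstance(Map<String, String> instanceByMember, String leaderInstanceId) {
        Set<String> members = new TreeSet<>();
        if (leaderInstanceId == null) {
            return members;
        }
        for (Map.Entry<String, String> entry : instanceByMember.entrySet()) {
            if (leaderInstanceId.equals(entry.getValue())) {
                members.add(entry.getKey());
            }
        }
        return members;
    }

    public static void main(String[] args) {
        Map<String, String> instanceByMember = Map.of(
            "consumer-1", "instance-A",
            "consumer-2", "instance-A",
            "consumer-3", "instance-B");
        System.out.println(membersOnInstance(instanceByMember, "instance-A"));
    }
}
```

Scheduling on the whole set trades a little redundancy for not needing the leader's memberId at all.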
Another solution would be to make sure the userdata is always different. We could encode an extra bit that flip-flops, but then we'd have to persist the latest value somewhere/somehow. Alternatively we could just encode the next probing rebalance time in the subscription userdata, since that is guaranteed to always be different from the previous rebalance. This might get tricky though, and certainly wastes space in the subscription userdata. Also, this would only solve the problem for KIP-441 probing rebalances, meaning we'd have to individually ensure the userdata has changed for every type of followup rebalance (see related issue below). So the first proposal, requiring the leader trigger the rebalance, would be preferable.
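As an illustration of the second idea, the sketch below encodes a next-probing-rebalance timestamp alongside the task offset sums. The byte layout, the class name `UserDataSketch`, and the integer task ids are assumptions made up for this example; this is not the real Streams subscription format.

```java
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.TreeMap;

class UserDataSketch {
    // Hypothetical layout, NOT the real Streams subscription encoding:
    // [nextProbingRebalanceMs: long][taskCount: int]([taskId: int][offsetSum: long])*
    static byte[] encode(Map<Integer, Long> taskOffsetSums, long nextProbingRebalanceMs) {
        // Sort by task id so that equal inputs always produce identical bytes.
        Map<Integer, Long> sorted = new TreeMap<>(taskOffsetSums);
        int size = Long.BYTES + Integer.BYTES + sorted.size() * (Integer.BYTES + Long.BYTES);
        ByteBuffer buffer = ByteBuffer.allocate(size);
        buffer.putLong(nextProbingRebalanceMs);
        buffer.putInt(sorted.size());
        for (Map.Entry<Integer, Long> entry : sorted.entrySet()) {
            buffer.putInt(entry.getKey());
            buffer.putLong(entry.getValue());
        }
        return buffer.array();
    }

    public static void main(String[] args) {
        // Two fully-restored active tasks, both encoded with the sentinel -2.
        Map<Integer, Long> offsetSums = Map.of(0, -2L, 1, -2L);
        byte[] first = encode(offsetSums, 600_000L);
        byte[] second = encode(offsetSums, 1_200_000L);
        System.out.println("userdata differs: " + !java.util.Arrays.equals(first, second));
    }
}
```

Even when the offset sums are unchanged, the differing timestamp changes the bytes, at the cost of eight extra bytes per subscription in this layout.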
Note that, imho, we should just allow anyone to trigger a rebalance by rejoining the group. But this would presumably require a broker-side change, and thus we would still need a workaround for KIP-441 to work with brokers that do not have that change.
Related issue:
This also means the Streams workaround for KAFKA-9821 is not airtight, as we encode the followup rebalance in the member who is supposed to receive a revoked partition, rather than the member who is actually revoking said partition. While the member doing the revoking will be guaranteed to have different userdata, the member receiving the partition may not. Making it the responsibility of the leader to trigger any type of followup rebalance would solve this issue as well.
Note that other types of followup rebalance (version probing, static membership with host info change) are guaranteed to have a change in the subscription userdata, and will not hit this bug.
Issue Links
- fixes KAFKA-10633 Constant probing rebalances in Streams 2.6 (Resolved)