Today we disable heartbeats whenever the state != MemberState.STABLE, and if a rebalance fails we set the state to UNJOINED. With the old poll(long) API this is fine, because we always try to complete the rebalance within the same call, so we never stay in UNJOINED or REBALANCING for very long.
With the new poll(Duration), however, we may return while still in UNJOINED or REBALANCING, and the next poll call may not come for some time (less than max.poll.interval.ms but longer than session.timeout.ms). Since heartbeats are disabled during this period, the coordinator could kick the member out of the group.
My proposal is:
1) Allow heartbeats to be sent during REBALANCING as well.
2) When the join/sync response has a retriable error, do not set the state to UNJOINED; stay in REBALANCING instead.
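
To make the two changes concrete, here is a rough sketch that compares the behavior described above with the proposed one. It is not the actual coordinator code: the class HeartbeatSketch, its nested MemberState enum, and all four method names are hypothetical and exist only for illustration. It assumes the heartbeat decision depends on the member state alone.

```java
public class HeartbeatSketch {
    // Simplified stand-in for the member states named in this ticket.
    enum MemberState { UNJOINED, REBALANCING, STABLE }

    // Behavior described above: heartbeats are sent only when STABLE.
    static boolean shouldHeartbeatCurrent(MemberState state) {
        return state == MemberState.STABLE;
    }

    // Proposal (1): also send heartbeats while REBALANCING.
    static boolean shouldHeartbeatProposed(MemberState state) {
        return state == MemberState.STABLE || state == MemberState.REBALANCING;
    }

    // Behavior described above: a failed rebalance resets the state to
    // UNJOINED, whether or not the error is retriable.
    static MemberState onJoinSyncFailureCurrent(boolean retriable) {
        return MemberState.UNJOINED;
    }

    // Proposal (2): a retriable join/sync error keeps the member in
    // REBALANCING; only a non-retriable error falls back to UNJOINED.
    static MemberState onJoinSyncFailureProposed(boolean retriable) {
        return retriable ? MemberState.REBALANCING : MemberState.UNJOINED;
    }

    public static void main(String[] args) {
        // A retriable join/sync error, followed by poll(Duration) returning
        // before the rebalance has completed.
        MemberState current = onJoinSyncFailureCurrent(true);
        MemberState proposed = onJoinSyncFailureProposed(true);
        System.out.println("current:  " + current
                + ", heartbeat=" + shouldHeartbeatCurrent(current));
        System.out.println("proposed: " + proposed
                + ", heartbeat=" + shouldHeartbeatProposed(proposed));
    }
}
```

In this sketch, after a retriable error the current path lands in UNJOINED with heartbeats off, while the proposed path stays in REBALANCING with heartbeats on, so the session should stay alive across the gap between poll calls.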