Take the following example trace where static members are joining the group:
1. A static member with instance A joined the group with an empty member.id. The coordinator generated member.id 1 for A and added it to the group. The group state is PreparingRebalance.
2. The group is formed and now we move on to CompletingRebalance.
3. Another member joins the group, causing it to transition back to PreparingRebalance, which could send a REBALANCE_IN_PROGRESS error to member A as well.
4. Member A gets the REBALANCE_IN_PROGRESS error and tries to re-join (again with an empty member.id).
5. The group is now advanced to CompletingRebalance again.
6. The group gets the second join-group from the known instance A with an empty member.id, generates a new member.id 2, and replaces member.id 1 with it.
7. The group gets the assignment from the leader, which only includes member.id 1 and not member.id 2.
8. The assignment for member.id 1 is dropped on the broker side while the assignment for member.id 2 is set to an empty byte array.
9. The empty byte array is sent back to instance A, causing the following error:
This error is only triggered when quite a few conditions line up, and hence it was not triggered very frequently.
Personally I think there's a correlation between this error and the observed https://issues.apache.org/jira/browse/KAFKA-9659 as well, which I'll keep investigating (will update in this ticket).
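
To make the trace easier to follow, here is a minimal toy model of it in Python. It is only a sketch under stated assumptions, not the actual coordinator code: `ToyGroup`, `join_with_empty_member_id`, `sync`, and `replay_trace` are hypothetical names, the member.id values are simple counters, and the other member from step 3 is given the made-up id "other".

```python
# Toy model of the trace above. This is NOT Kafka's coordinator code;
# every class, method, and id here is hypothetical and simplified.

class ToyGroup:
    def __init__(self):
        self.state = "Empty"
        self.static_members = {}  # group.instance.id -> current member.id
        self.assignments = {}     # member.id -> assignment bytes
        self._counter = 0

    def join_with_empty_member_id(self, instance_id):
        """Generate a new member.id, replacing any id the instance already had."""
        self._counter += 1
        new_id = str(self._counter)
        self.static_members[instance_id] = new_id
        return new_id

    def sync(self, leader_assignment):
        """Apply the leader's assignment to the members currently in the group.

        Entries for member.ids no longer in the group are dropped; members
        the leader did not know about get an empty byte array.
        """
        current_ids = set(self.static_members.values())
        self.assignments = {
            member_id: leader_assignment.get(member_id, b"")
            for member_id in current_ids
        }
        self.state = "Stable"


def replay_trace():
    group = ToyGroup()

    # Step 1: instance A joins with an empty member.id and gets member.id 1.
    first_id = group.join_with_empty_member_id("A")
    group.state = "PreparingRebalance"

    # Step 2: the group is formed.
    group.state = "CompletingRebalance"

    # Step 3: another member joins (assumed id "other"), forcing a rebalance.
    group.static_members["B"] = "other"
    group.state = "PreparingRebalance"

    # Steps 4-6: A re-joins with an empty member.id while the group has
    # already advanced again; member.id 2 replaces member.id 1.
    group.state = "CompletingRebalance"
    second_id = group.join_with_empty_member_id("A")

    # Step 7: the leader computed its assignment before the replacement,
    # so it only knows about member.id 1.
    leader_assignment = {first_id: b"partitions-for-A", "other": b"partitions-for-B"}

    # Step 8: the entry for member.id 1 is dropped, member.id 2 gets b"".
    group.sync(leader_assignment)
    return group, first_id, second_id


if __name__ == "__main__":
    group, first_id, second_id = replay_trace()
    # Step 9: this empty payload is what instance A would receive.
    print(repr(group.assignments[second_id]))
```

The point of the sketch is the mismatch in `sync`: the leader's assignment is keyed by the old member.id, so once that id has been replaced, the lookup for the new id finds nothing and falls back to an empty byte array.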