We've noticed that reassigning a topic's partitions seems to adversely impact other topics. Specifically, followers for other topics fall out of the ISR.
While I'm not 100% sure why this happens, the scenario seems to be as follows:
1. Reassignment is manually triggered on topic-partition X-Y, and broker A (which used to be a follower for X-Y) is no longer a follower.
2. Broker A sends a `FetchRequest` that includes topic-partition X-Y to broker B, just after the reassignment.
3. Broker B can fulfill the `FetchRequest`, but while trying to do so it tries to record the position of "follower" A. This fails, because broker A is no longer a follower for X-Y (see exception below).
4. The entire `FetchRequest` fails, so broker A receives no data for any partition in it, and the other topics it follows start falling behind.
5. Depending on the length of the reassignment, this sequence repeats.
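
To make the suspected sequence concrete, here is a toy model in Python. It is not Kafka's broker code: `record_follower_position`, `handle_fetch_all_or_nothing`, `NotAssignedReplicaError` and the partition `Z-0` are hypothetical names invented for illustration. It only shows how one failing partition could sink a multi-partition fetch if the error is not handled per partition.

```python
class NotAssignedReplicaError(Exception):
    """Stand-in for Kafka's NotAssignedReplicaException (toy model only)."""


def record_follower_position(assigned_replicas, partition, follower_id):
    # Hypothetical helper: the leader refuses to track a broker that is
    # no longer in the partition's replica set.
    if follower_id not in assigned_replicas[partition]:
        raise NotAssignedReplicaError(
            f"broker {follower_id} is not a replica of {partition}"
        )


def handle_fetch_all_or_nothing(assigned_replicas, log_data, follower_id, partitions):
    """Serve a multi-partition fetch; any exception aborts the whole response."""
    response = {}
    for partition in partitions:
        record_follower_position(assigned_replicas, partition, follower_id)
        response[partition] = log_data[partition]
    return response


# Broker "A" was removed from X-Y by the reassignment but still follows Z-0.
assigned_replicas = {"X-Y": {"B", "C"}, "Z-0": {"A", "B"}}
log_data = {"X-Y": ["x1"], "Z-0": ["z1", "z2"]}

try:
    result = handle_fetch_all_or_nothing(
        assigned_replicas, log_data, "A", ["Z-0", "X-Y"]
    )
except NotAssignedReplicaError:
    # A receives nothing, not even the Z-0 data it is still entitled to.
    result = None
```

In this model, broker A gets no data for `Z-0` as long as its fetch still includes `X-Y`, which matches the lag we observe on unrelated topics.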
In step 3, we see exceptions like:
Does my assessment make sense? If so, this behaviour seems problematic. A few changes that might improve matters (assuming I'm on the right track):
1. `FetchRequest` should be able to return partial results
2. The broker fulfilling the `FetchRequest` could ignore the `NotAssignedReplicaException` and return results without recording the position of the broker that is no longer a follower.
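
As a rough sketch of the first suggestion, again as a Python toy model and not actual broker code: the handler below catches the error per partition, so the offending partition is marked with an error while the rest of the fetch still succeeds. The function name `handle_fetch_partial`, the response layout, and the `"NOT_ASSIGNED_REPLICA"` label are assumptions made for illustration.

```python
class NotAssignedReplicaError(Exception):
    """Stand-in for Kafka's NotAssignedReplicaException (toy model only)."""


def handle_fetch_partial(assigned_replicas, log_data, follower_id, partitions):
    """Serve a multi-partition fetch, isolating failures to single partitions."""
    response = {}
    for partition in partitions:
        try:
            if follower_id not in assigned_replicas[partition]:
                raise NotAssignedReplicaError(partition)
            response[partition] = {"error": None, "records": log_data[partition]}
        except NotAssignedReplicaError:
            # Only this partition is marked as failed; the loop continues.
            response[partition] = {"error": "NOT_ASSIGNED_REPLICA", "records": []}
    return response


# Broker "A" was removed from X-Y by the reassignment but still follows Z-0.
assigned_replicas = {"X-Y": {"B", "C"}, "Z-0": {"A", "B"}}
log_data = {"X-Y": ["x1"], "Z-0": ["z1", "z2"]}

response = handle_fetch_partial(assigned_replicas, log_data, "A", ["Z-0", "X-Y"])
```

Here broker A still receives the `Z-0` records and sees an error only for `X-Y`. The second suggestion would differ in that the `X-Y` records are returned as well, with just the position update skipped.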
This behaviour was experienced with 0.10.1.1, although looking at the changelogs and the code in question, I don't see any reason why it would have changed in later versions.
I'd be very interested in some discussion on this. Thanks!