We had a Kafka cluster running 2.8 version with interbroker protocol set to 2.7. It had a number of topics and everything was fine.
Then we decided to update the interbroker protocol to 2.8 by the following procedure:
1. Run new brokers with the interbroker protocol set to 2.8.
2. Move the data from the old brokers to the new ones (normal partition reassignment API).
3. Decommission the old brokers.
At the stage 2 we had the problem: old brokers started failing on LeaderAndIsrRequest handling with
for multiple topics. Topics were not recreated.
We checked partition.metadata files and IDs there were indeed different from the values in ZooKeeper. It was fixed by deleting the metadata files (and letting them be recreated).
The logs, unfortunately, didn't show anything that might point to the cause of the issue (or it happened longer ago than we store the logs).
We tried to reproduce this also, but no success.
If the community can point out what to check or beware of in future, it will be great. We'll be happy to provide additional information if needed. Thank you!
Sorry for the ticket that might be not very actionable. We hope to at least rise awareness of this issue.