Description
KIP-595 describes an extra condition on commitment here: https://cwiki.apache.org/confluence/display/KAFKA/KIP-595%3A+A+Raft+Protocol+for+the+Metadata+Quorum#KIP595:ARaftProtocolfortheMetadataQuorum-Fetch. In order to ensure that a newly elected leader's committed entries cannot get lost, it must commit one record from its own epoch. This guarantees that its latest entry is larger (in terms of epoch/offset) than any previously written record which ensures that any future leader must also include it. This is the purpose of the LeaderChange record which is written to the log as soon as the leader gets elected.
We have this check implemented here: https://github.com/apache/kafka/blob/trunk/raft/src/main/java/org/apache/kafka/raft/LeaderState.java#L122. However, the check needs to be a strict inequality since the epoch start offset does not reflect the LeaderChange record itself. In other words, the check is off by one.
Attachments
Issue Links
- links to