Details
- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
IGNITE-16668 describes a design that is currently implemented in the PartitionReplicatorNodeRecovery class. Restarting an Ignite node that hosts a volatile partition (which uses volatile Raft metastorage) breaks the Raft assumption that the Raft metastorage is kept on stable storage, i.e. that its contents cannot be destroyed. The additional recovery logic is needed to make sure that we are still safe and that the broken assumption does not cause any trouble.
The trouble that can be caused is the 'double-vote' problem. The scenario is as follows:
- Node A starts election and proposes itself as a candidate
- Node C votes for node A
- Node C restarts and loses votedFor in its Raft metastorage
- Node B starts election and proposes itself as a candidate (in the same term in which node A proposed itself)
- Node C votes for B (as it forgot that it already voted for A in this term)
- Both nodes A and B get elected as leaders, so the 'at most one leader per term' property of Raft is broken
Our test should try to get 2 leaders in the same term by repeating this scenario. A lot of message cancellation/delaying will probably be needed (for example, to make sure that node B does not see node A's self-proposals).
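To illustrate the invariant such a test would check, the double-vote scenario can be modelled in a few lines of plain Java. This is only a toy sketch: DoubleVoteSketch, Voter, and leadersInTerm are hypothetical names, not Ignite or JRaft classes, and the model assumes a 3-node group in which each candidate votes for itself and B never sees A's proposal.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the double-vote scenario; hypothetical, not Ignite or JRaft code. */
public class DoubleVoteSketch {
    /** Minimal voter that keeps its term -> votedFor mapping in memory only. */
    static final class Voter {
        private Map<Long, String> votedFor = new HashMap<>();

        /** Grants the vote unless this voter already voted for someone else in the term. */
        boolean requestVote(long term, String candidate) {
            String previous = votedFor.putIfAbsent(term, candidate);
            return previous == null || previous.equals(candidate);
        }

        /** Simulates a restart with volatile metastorage: votedFor is lost. */
        void restartLosingState() {
            votedFor = new HashMap<>();
        }
    }

    /** Plays the scenario for the group {A, B, C} and returns the leaders elected in the term. */
    public static List<String> leadersInTerm(long term, boolean restartC) {
        Voter c = new Voter();
        List<String> leaders = new ArrayList<>();
        int majority = 2; // 2 out of 3 nodes

        // A votes for itself and asks C for its vote.
        int votesForA = 1 + (c.requestVote(term, "A") ? 1 : 0);
        if (votesForA >= majority) {
            leaders.add("A");
        }

        if (restartC) {
            c.restartLosingState();
        }

        // B did not see A's proposal, so it votes for itself and asks C.
        int votesForB = 1 + (c.requestVote(term, "B") ? 1 : 0);
        if (votesForB >= majority) {
            leaders.add("B");
        }

        return leaders;
    }

    public static void main(String[] args) {
        System.out.println("without restart: " + leadersInTerm(1, false));
        System.out.println("with restart:    " + leadersInTerm(1, true));
    }
}
```

Without the restart the model elects only A. With the restart it elects both A and B in the same term, which is the outcome the real test should fail to reproduce while the recovery logic is in place.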
Issue Links
- is related to IGNITE-16668 Design in-memory raft group reconfiguration on node failure (Resolved)