Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- None
Description
In SCM HA, the primary node starts the Ratis server, and the other bootstrapping nodes are then added to the Ratis group. If all the bootstrapped SCMs are stopped, the primary node steps down from leadership because it loses its majority. When the stopped nodes are bootstrapped again, each one first tries to validate its persisted cluster ID against the cluster ID of the leader SCM. Because no leader exists, bootstrapping keeps failing and retrying until the node shuts down.
The issue is easy to reproduce in Kubernetes deployments, where the bootstrap and init commands are run again on every restart.
This Jira aims to bypass the cluster ID validation if a bootstrapping node already has a cluster ID.
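The intended behavior can be illustrated with a minimal sketch. This is not the actual Ozone code: the class `ScmBootstrapSketch`, the `Result` enum, and the `leaderClusterIdLookup` supplier are hypothetical names introduced only for illustration, and the real fix may be structured differently. The sketch assumes that a node with a persisted cluster ID can skip the leader lookup entirely, while a fresh node still needs a leader to obtain one.

```java
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical sketch of the proposed bypass; not the actual Ozone SCM code. */
final class ScmBootstrapSketch {

  /** Possible outcomes of one bootstrap attempt in this sketch. */
  enum Result { ALREADY_BOOTSTRAPPED, FETCHED_FROM_LEADER, NO_LEADER }

  private ScmBootstrapSketch() { }

  /**
   * @param persistedClusterId    cluster ID found on local disk, if any
   * @param leaderClusterIdLookup assumed hook that asks the leader SCM for the
   *                              cluster ID; returns empty when no leader exists
   */
  static Result bootstrap(Optional<String> persistedClusterId,
                          Supplier<Optional<String>> leaderClusterIdLookup) {
    // Proposed bypass: a node that already has a cluster ID was bootstrapped
    // before, so it does not contact the leader at all.
    if (persistedClusterId.isPresent()) {
      return Result.ALREADY_BOOTSTRAPPED;
    }
    // A fresh node still needs a leader to learn the cluster ID.
    Optional<String> fromLeader = leaderClusterIdLookup.get();
    return fromLeader.isPresent() ? Result.FETCHED_FROM_LEADER : Result.NO_LEADER;
  }
}
```

With this shape, a restarted node that already holds a cluster ID returns immediately even when the Ratis group has no leader, so the repeated bootstrap command run on every Kubernetes restart no longer ends in a retry loop.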