Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Description
Although not a common scenario, I recently encountered an issue where a user accidentally deleted the SCM certs from the local certs directory on all SCMs.
On restart, the primordial node self-signs a new cert and starts up fine.
However, the non-primordial members of the HA ring send a CSR to the primordial node. This request does not go through because it requires a quorum, i.e. Ratis needs to be initialised. For Ratis to be initialised, it needs to verify certs. This leads to a deadlock-like situation.
Ratis is needed for the CSR (getCert) request because it persists the signed sub-CA cert in its RocksDB.
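The circular dependency described above can be modelled as a small "needs" graph. This is only an illustrative sketch: the step names (`local_certs`, `signed_csr`, `ratis_quorum`, `cert_verification`) and the `find_cycle` helper are hypothetical labels for the steps in this description, not identifiers from the SCM code.

```python
from typing import Dict, List, Optional

# Hypothetical step names: each step maps to the steps it needs first,
# as seen from a non-primordial SCM whose local certs were deleted.
NEEDS: Dict[str, List[str]] = {
    "local_certs": ["signed_csr"],         # certs are gone, so a CSR must be signed
    "signed_csr": ["ratis_quorum"],        # getCert persists the cert through Ratis
    "ratis_quorum": ["cert_verification"], # Ratis verifies certs before it initialises
    "cert_verification": ["local_certs"],  # verification needs the local certs
}


def find_cycle(needs: Dict[str, List[str]], start: str) -> Optional[List[str]]:
    """Return the first dependency cycle reachable from start, or None."""
    path: List[str] = []
    on_path = set()

    def visit(step: str) -> Optional[List[str]]:
        if step in on_path:
            # Reached a step that is already being resolved: a cycle.
            return path[path.index(step):] + [step]
        on_path.add(step)
        path.append(step)
        for dep in needs.get(step, []):
            cycle = visit(dep)
            if cycle:
                return cycle
        path.pop()
        on_path.discard(step)
        return None

    return visit(start)


if __name__ == "__main__":
    print(find_cycle(NEEDS, "local_certs"))
```

In this model the primordial node escapes the cycle because it self-signs: replacing its `local_certs` entry with an empty list leaves no cycle to find.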
As mentioned at the beginning, this is not a common case, but since we already persist certs in the SCM RocksDB, a tool to restore certs would be helpful here.
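As a rough sketch of what such a restore tool might do, assuming the persisted certs have already been read out of the SCM RocksDB into a mapping of serial ID to PEM text: the `restore_certs` function, the mapping shape, and the `<serial>.crt` file naming below are all assumptions made for illustration, not the actual SCM layout or API.

```python
from pathlib import Path
from typing import Dict, List


def restore_certs(persisted: Dict[str, str], certs_dir: Path) -> List[Path]:
    """Write any persisted cert that is missing from certs_dir.

    persisted maps a cert serial ID to its PEM text (assumed to have been
    read from the SCM RocksDB by some other means). Existing files are
    left untouched, so the operation is safe to repeat.
    """
    certs_dir.mkdir(parents=True, exist_ok=True)
    restored: List[Path] = []
    for serial, pem in sorted(persisted.items()):
        target = certs_dir / f"{serial}.crt"  # hypothetical naming scheme
        if target.exists():
            continue  # never overwrite a cert that is still on disk
        target.write_text(pem)
        restored.append(target)
    return restored
```

Because the restore reads from local storage only, it would not depend on Ratis being initialised, which is what would let a non-primordial SCM get past the deadlock.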