Details
-
Bug
-
Status: Resolved
-
Normal
-
Resolution: Fixed
-
None
-
Availability - Unavailable
-
Critical
-
Challenging
-
User Report
-
All
-
None
-
Description
We see this exception around nodes crashing and trying to do a host replacement; this error appears to be correlated around multiple node failures.
A simplified case to trigger this is the following
*) Have a N node cluster
*) Shutdown all N nodes
*) Bring up N-1 nodes (at least 1 seed, else replace seed)
*) Host replace the N-1th node -> this will fail with the above
The reason this happens is that the N-1th node isn’t gossiping anymore, and the existing nodes do not have its details in gossip (but have the details in the peers table), so the host replacement fails as the node isn’t known in gossip.
This affects all versions (tested 3.0 and trunk, assume 2.2 as well)
Attachments
Issue Links
- links to