Details
Type: Improvement
Status: Resolved
Priority: Low
Resolution: Fixed
None
Description
When a node leaves the cluster for good (via removetoken or decommission, or because it is replaced via replace_token), we clean up all references to it in MS and Gossiper. However, if the node was a seed, we never remove its entry from the seeds list in Gossiper. As a result, Gossiper keeps trying to communicate with the removed node, because we try to talk to a random seed on each round of gossip.
The attached patch removes the node from the seed list when it leaves the cluster and, in addition, calls the SeedProvider for an updated list of seeds. In a dynamic environment like EC2, when a node dies or is replaced, that node is never coming back. Thus it is advantageous to get a refreshed set of seeds to help with the network partition healing aspect of gossip (we dynamically retrieve that list in Priam). This makes seeds a bit more of a dynamic concept, but that is rather essential in a dynamic cloud environment.
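To illustrate the behavior described above, here is a minimal standalone sketch. It is not the attached patch and does not use Cassandra's classes: `SeedListSketch`, `onNodeRemoved`, `refreshSeeds`, and the `Supplier` standing in for the SeedProvider are all hypothetical names, and addresses are plain strings for simplicity. It assumes the refreshed list should replace the old one, and that the departed node should be dropped even if the provider still reports it.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical stand-in for the seed bookkeeping described in this ticket.
class SeedListSketch {
    private final Set<String> seeds = new HashSet<>();
    private final Supplier<List<String>> seedSource; // stands in for the SeedProvider
    private final String localAddress;

    SeedListSketch(String localAddress, Supplier<List<String>> seedSource) {
        this.localAddress = localAddress;
        this.seedSource = seedSource;
        refreshSeeds();
    }

    // Rebuild the seed set from the provider, skipping the local node.
    synchronized void refreshSeeds() {
        seeds.clear();
        for (String address : seedSource.get()) {
            if (!address.equals(localAddress)) {
                seeds.add(address);
            }
        }
    }

    // Called when a node has left the cluster for good.
    synchronized void onNodeRemoved(String address) {
        if (!seeds.contains(address)) {
            return; // the node was not a seed, nothing to update
        }
        refreshSeeds();
        // The provider may be stale and still list the departed node.
        seeds.remove(address);
    }

    // Defensive copy so callers cannot mutate internal state.
    synchronized Set<String> currentSeeds() {
        return new HashSet<>(seeds);
    }
}
```

Removing the address after the refresh, rather than before, keeps a stale provider response from reintroducing the departed node into the set that gossip picks from.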
I believe this also resolves the repeated log message of the form:
Nodes /10.217.XXX.YYY and /10.217.AAA.BBB have the same token 56713727820156410577229101240436610841. Ignoring /10.217.XXX.YYY