Details
-
Bug
-
Status: Resolved
-
Normal
-
Resolution: Won't Fix
-
None
-
None
-
None
-
Normal
Description
If a node failed and a cluster was restarted (which is common case on massive outages), replace_address fails with
Caused by: java.lang.RuntimeException: Cannot replace_address /172.19.56.97 because it doesn't exist in gossip jvm 1 | at org.apache.cassandra.service.StorageService.prepareReplacementInfo(StorageService.java:472) jvm 1 | at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:724) jvm 1 | at org.apache.cassandra.service.StorageService.initServer(StorageService.java:686) jvm 1 | at org.apache.cassandra.service.StorageService.initServer(StorageService.java:562)
Although neccessary information is saved in system tables on seed nodes, it is not loaded to gossip on seed node, so a replacement node cannot get this info.
Attached patch loads all information from system tables to gossip with generation 0 and fixes some bugs around this info on shadow gossip round.