Details
-
Bug
-
Status: Resolved
-
Normal
-
Resolution: Fixed
-
None
-
Availability
-
Normal
-
Normal
-
User Report
-
All
-
None
-
Description
CASSANDRA-15158 might have introduced a bug where bootstrapping nodes wait for schema agreement from nodes that have been removed if token allocation for keyspace is enabled.
It is fairly easy to reproduce with the following steps:
// Create 3 node cluster ccm create test --vnodes -n 3 -s -v 3.11.10 // Remove two nodes ccm node2 decommission ccm node3 decommission ccm node2 remove ccm node3 remove // Create keyspace to change the schema. It works if the schema never changes. ccm node1 cqlsh -x "CREATE KEYSPACE k WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};" // Add allocate parameter ccm updateconf 'allocate_tokens_for_keyspace: k' // Add node2 again to cluster ccm add node2 -i 127.0.0.2 -j 7200 -r 2200 ccm node2 start
This will cause node2 to throw exception on startup:
WARN [main] 2021-04-08 14:10:53,272 StorageService.java:941 - There are nodes in the cluster with a different schema version than us we did not merged schemas from, our version : (a5da47ec-ffe3-3111-b2f3-325f771f1539), outstanding versions -> endpoints : {8e9ec79e-5ed2-3949-8ac8-794abfee3837=[/127.0.0.3]} ERROR [main] 2021-04-08 14:10:53,274 CassandraDaemon.java:803 - Exception encountered during startup java.lang.RuntimeException: Didn't receive schemas for all known versions within the timeout at org.apache.cassandra.service.StorageService.waitForSchema(StorageService.java:947) ~[apache-cassandra-3.11.10.jar:3.11.10] at org.apache.cassandra.dht.BootStrapper.allocateTokens(BootStrapper.java:206) ~[apache-cassandra-3.11.10.jar:3.11.10] at org.apache.cassandra.dht.BootStrapper.getBootstrapTokens(BootStrapper.java:177) ~[apache-cassandra-3.11.10.jar:3.11.10] at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:1073) ~[apache-cassandra-3.11.10.jar:3.11.10] at org.apache.cassandra.service.StorageService.initServer(StorageService.java:753) ~[apache-cassandra-3.11.10.jar:3.11.10] at org.apache.cassandra.service.StorageService.initServer(StorageService.java:687) ~[apache-cassandra-3.11.10.jar:3.11.10] at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:395) [apache-cassandra-3.11.10.jar:3.11.10] at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:633) [apache-cassandra-3.11.10.jar:3.11.10] at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:786) [apache-cassandra-3.11.10.jar:3.11.10] INFO [StorageServiceShutdownHook] 2021-04-08 14:10:53,279 HintsService.java:209 - Paused hints dispatch WARN [StorageServiceShutdownHook] 2021-04-08 14:10:53,280 Gossiper.java:1670 - No local state, state is in silent shutdown, or node hasn't joined, not announcing shutdown INFO [StorageServiceShutdownHook] 2021-04-08 14:10:53,280 MessagingService.java:985 - Waiting for messaging service to quiesce INFO [ACCEPT-/127.0.0.2] 2021-04-08 14:10:53,281 MessagingService.java:1346 - MessagingService has terminated the accept() thread INFO [StorageServiceShutdownHook] 2021-04-08 14:10:53,416 HintsService.java:209 - Paused hints dispatch