  1. Cassandra
  2. CASSANDRA-16577

Node waits for schema agreement on removed nodes


Details

    Description

      CASSANDRA-15158 might have introduced a bug: when token allocation for a keyspace is enabled (allocate_tokens_for_keyspace), a bootstrapping node waits for schema agreement from nodes that have already been removed from the cluster.

       

      It is fairly easy to reproduce with the following steps:

      # Create a 3-node cluster
      ccm create test --vnodes -n 3 -s -v 3.11.10
      
      # Remove two nodes
      ccm node2 decommission
      ccm node3 decommission
      ccm node2 remove
      ccm node3 remove
      
      # Create a keyspace to change the schema. The bug does not appear if the schema never changes.
      ccm node1 cqlsh -x "CREATE KEYSPACE k WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};"
      
      # Enable token allocation for the keyspace
      ccm updateconf 'allocate_tokens_for_keyspace: k'
      
      # Add node2 to the cluster again
      ccm add node2 -i 127.0.0.2 -j 7200 -r 2200
      ccm node2 start

       

      This causes node2 to throw an exception on startup:

      WARN  [main] 2021-04-08 14:10:53,272 StorageService.java:941 - There are nodes in the cluster with a different schema version than us we did not merged schemas from, our version : (a5da47ec-ffe3-3111-b2f3-325f771f1539), outstanding versions -> endpoints : {8e9ec79e-5ed2-3949-8ac8-794abfee3837=[/127.0.0.3]}
      ERROR [main] 2021-04-08 14:10:53,274 CassandraDaemon.java:803 - Exception encountered during startup
      java.lang.RuntimeException: Didn't receive schemas for all known versions within the timeout
              at org.apache.cassandra.service.StorageService.waitForSchema(StorageService.java:947) ~[apache-cassandra-3.11.10.jar:3.11.10]
              at org.apache.cassandra.dht.BootStrapper.allocateTokens(BootStrapper.java:206) ~[apache-cassandra-3.11.10.jar:3.11.10]
              at org.apache.cassandra.dht.BootStrapper.getBootstrapTokens(BootStrapper.java:177) ~[apache-cassandra-3.11.10.jar:3.11.10]
              at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:1073) ~[apache-cassandra-3.11.10.jar:3.11.10]
              at org.apache.cassandra.service.StorageService.initServer(StorageService.java:753) ~[apache-cassandra-3.11.10.jar:3.11.10]
              at org.apache.cassandra.service.StorageService.initServer(StorageService.java:687) ~[apache-cassandra-3.11.10.jar:3.11.10]
              at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:395) [apache-cassandra-3.11.10.jar:3.11.10]
              at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:633) [apache-cassandra-3.11.10.jar:3.11.10]
              at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:786) [apache-cassandra-3.11.10.jar:3.11.10]
      INFO  [StorageServiceShutdownHook] 2021-04-08 14:10:53,279 HintsService.java:209 - Paused hints dispatch
      WARN  [StorageServiceShutdownHook] 2021-04-08 14:10:53,280 Gossiper.java:1670 - No local state, state is in silent shutdown, or node hasn't joined, not announcing shutdown
      INFO  [StorageServiceShutdownHook] 2021-04-08 14:10:53,280 MessagingService.java:985 - Waiting for messaging service to quiesce
      INFO  [ACCEPT-/127.0.0.2] 2021-04-08 14:10:53,281 MessagingService.java:1346 - MessagingService has terminated the accept() thread
      INFO  [StorageServiceShutdownHook] 2021-04-08 14:10:53,416 HintsService.java:209 - Paused hints dispatch
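
The WARN line in the log above names the endpoints whose schema versions the node is still waiting for. A minimal sketch for pulling those addresses out of a log follows; it assumes the message format shown here (a trailing `outstanding versions -> endpoints : {version=[/ip, ...]}` map), and `outstanding_endpoints` is a hypothetical helper name, not something provided by Cassandra or ccm.

```shell
# Hypothetical helper: read log lines on stdin and print the endpoints listed
# under "outstanding versions -> endpoints", one address per line.
# Assumes the WARN message format shown in the log above.
outstanding_endpoints() {
    # Keep only the text after the "outstanding versions -> endpoints : " marker,
    # extract every "/a.b.c.d" address, then drop the leading slash and duplicates.
    sed -n 's/.*outstanding versions -> endpoints : //p' \
        | grep -oE '/[0-9]+(\.[0-9]+){3}' \
        | tr -d '/' \
        | sort -u
}
```

Assuming ccm's default directory layout, it could be used as `outstanding_endpoints < ~/.ccm/test/node2/logs/system.log`. For the log above the only address listed is 127.0.0.3, which is not one of the nodes left in the cluster after the removal steps.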

       

       

       

          People

            Brandon Williams
            Jan Karlsson
            Brandon Williams
            Adam Holmberg, Andres de la Peña, Blake Eggleston
            Votes: 0
            Watchers: 6
