Uploaded image for project: 'Cassandra'
  1. Cassandra
  2. CASSANDRA-16296

Join new node to cluster failing on C* version 4

    XMLWordPrintableJSON

Details

    Description

      Hi, i'm using cassandra 4.0-beta3,
      Join new node failing,
      Current status:

      [root@rc801 ~]# nodetool status
      Datacenter: V4CH
      ================
      Status=Up/Down
      |/ State=Normal/Leaving/Joining/Moving
      --  Address      Load        Tokens  Owns (effective)  Host ID                               Rack
      UN  1.1.1.1      356.55 KiB  128     ?                 92ae4c39-edb3-4e67-8623-b49fd8301b66  RAC1
      UN  2.2.2.2      424.36 KiB  128     ?                 d80647a8-32b2-4a8f-8022-f5ae3ce8fbb2  RAC1
      
      Datacenter: V4NJ
      ================
      Status=Up/Down
      |/ State=Normal/Leaving/Joining/Moving
      --  Address      Load        Tokens  Owns (effective)  Host ID                               Rack
      UN  3.3.3.3      322.97 KiB  128     ?                 0106ce0b-18be-4321-8247-843076ff42e6  RAC1
      UN  4.4.4.4      297.6 KiB   128     ?                 d81e2b7d-9221-4b1b-86c8-e8956e3eec07  RAC1
      UN  5.5.5.5      324.75 KiB  128     ?                 5dfbaf14-8310-4d64-b61a-9da0a7a8a620  RAC1
      

      Error on new host (7.7.7.7):

      INFO  [main] 2020-11-22 09:27:36,592 RangeStreamer.java:323 - Bootstrap: range Full(/7.7.7.7:7000,(-7734271291904743823,-7725025391901227341]) exists on Full(/1.1.1.1:7000,(-7734271291904743823,-7725025391901227341]) for keyspace system_traces
      ERROR [main] 2020-11-22 09:27:36,609 CassandraDaemon.java:817 - Exception encountered during startup
      java.lang.IllegalStateException: Multiple strict sources found for Full(/7.7.7.7:7000,(9208454697546654053,-9212763148842799409]), sources: [Full(/2.2.2.2:7000,(9189373804905478622,-9212763148842799409]), Full(/1.1.1.1:7000,(9189373804905478622,-9212763148842799409])]
              at org.apache.cassandra.dht.RangeStreamer.calculateRangesToFetchWithPreferredEndpoints(RangeStreamer.java:517)
              at org.apache.cassandra.dht.RangeStreamer.calculateRangesToFetchWithPreferredEndpoints(RangeStreamer.java:383)
              at org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:320)
              at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:81)
              at org.apache.cassandra.service.StorageService.startBootstrap(StorageService.java:1635)
              at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1612)
              at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:931)
              at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:892)
              at org.apache.cassandra.service.StorageService.initServer(StorageService.java:699)
              at org.apache.cassandra.service.StorageService.initServer(StorageService.java:635)
              at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:407)
              at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:671)
              at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:795)
      INFO  [StorageServiceShutdownHook] 2020-11-22 09:27:36,616 HintsService.java:220 - Paused hints dispatch
      WARN  [StorageServiceShutdownHook] 2020-11-22 09:27:36,617 Gossiper.java:1849 - No local state, state is in silent shutdown, or node hasn't joined, not announcing shutdown
      INFO  [StorageServiceShutdownHook] 2020-11-22 09:27:36,617 MessagingService.java:441 - Waiting for messaging service to quiesce
      

      Errors on seed node:

      INFO  [Messaging-EventLoop-3-37] 2020-11-22 09:26:54,121 InboundConnectionInitiator.java:459 - /7.7.7.7:7000(/7.7.7.7:45490)->/1.1.1.1:7000-URGENT_MESSAGES-f412b9ba messaging connection established, version = 12, framing = LZ4, encryption = disabled
      INFO  [Messaging-EventLoop-3-40] 2020-11-22 09:26:54,157 OutboundConnection.java:1151 - /1.1.1.1:7000(/1.1.1.1:38678)->/7.7.7.7:7000-URGENT_MESSAGES-3b23f639 successfully connected, version = 12, framing = LZ4, encryption = disabled
      INFO  [GossipStage:1] 2020-11-22 09:26:56,109 Gossiper.java:1248 - Node /7.7.7.7:7000 is now part of the cluster
      INFO  [GossipStage:1] 2020-11-22 09:26:56,147 Gossiper.java:1206 - InetAddress /7.7.7.7:7000 is now UP
      INFO  [Messaging-EventLoop-3-1] 2020-11-22 09:26:57,378 InboundConnectionInitiator.java:459 - /7.7.7.7:7000(/7.7.7.7:45504)->/1.1.1.1:7000-SMALL_MESSAGES-77c813d4 messaging connection established, version = 12, framing = LZ4, encryption = disabled
      INFO  [Messaging-EventLoop-3-38] 2020-11-22 09:26:57,383 OutboundConnection.java:1151 - /1.1.1.1:7000(/1.1.1.1:38680)->/7.7.7.7:7000-SMALL_MESSAGES-e75eaaa2 successfully connected, version = 12, framing = LZ4, encryption = disabled
      INFO  [Messaging-EventLoop-3-2] 2020-11-22 09:27:36,623 InboundConnectionInitiator.java:459 - /7.7.7.7:7000(/7.7.7.7:45534)->/1.1.1.1:7000-LARGE_MESSAGES-defc02dd messaging connection established, version = 12, framing = LZ4, encryption = disabled
      INFO  [Messaging-EventLoop-3-40] 2020-11-22 09:27:37,151 NoSpamLogger.java:92 - /1.1.1.1:7000->/7.7.7.7:7000-URGENT_MESSAGES-[no-channel] failed to connect
      io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: Connection refused: /7.7.7.7:7000
      Caused by: java.net.ConnectException: finishConnect(..) failed: Connection refused
              at io.netty.channel.unix.Errors.throwConnectException(Errors.java:124)
              at io.netty.channel.unix.Socket.finishConnect(Socket.java:243)
              at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.doFinishConnect(AbstractEpollChannel.java:672)
              at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.finishConnect(AbstractEpollChannel.java:649)
              at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.epollOutReady(AbstractEpollChannel.java:529)
              at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:465)
              at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378)
              at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
              at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
              at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
              at java.base/java.lang.Thread.run(Thread.java:834)
      INFO  [GossipStage:1] 2020-11-22 09:27:57,154 Gossiper.java:1224 - InetAddress /7.7.7.7:7000 is now DOWN
      

      Looks that calculateRangesToFetchWithPreferredEndpoints added as part of "Transient Replication" https://github.com/apache/cassandra/commit/f7431b432875e334170ccdb19934d05545d2cebd

      It's new fresh cluster without data and the join failing after few seconds.

      Thank you,
      Yakir Gibraltar.

      Attachments

        Issue Links

          Activity

            People

              bereng Berenguer Blasi
              yakir.g Yakir Gibraltar
              Berenguer Blasi
              Andres de la Peña, Ekaterina Dimitrova, Tomasz Lasica
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 19h 40m
                  19h 40m