Details
-
Bug
-
Status: Resolved
-
Urgent
-
Resolution: Fixed
-
None
-
Availability - Process Crash
-
Critical
-
Normal
-
User Report
-
All
-
None
-
Description
Hi, i'm using cassandra 4.0-beta3,
Join new node failing,
Current status:
[root@rc801 ~]# nodetool status Datacenter: V4CH ================ Status=Up/Down |/ State=Normal/Leaving/Joining/Moving -- Address Load Tokens Owns (effective) Host ID Rack UN 1.1.1.1 356.55 KiB 128 ? 92ae4c39-edb3-4e67-8623-b49fd8301b66 RAC1 UN 2.2.2.2 424.36 KiB 128 ? d80647a8-32b2-4a8f-8022-f5ae3ce8fbb2 RAC1 Datacenter: V4NJ ================ Status=Up/Down |/ State=Normal/Leaving/Joining/Moving -- Address Load Tokens Owns (effective) Host ID Rack UN 3.3.3.3 322.97 KiB 128 ? 0106ce0b-18be-4321-8247-843076ff42e6 RAC1 UN 4.4.4.4 297.6 KiB 128 ? d81e2b7d-9221-4b1b-86c8-e8956e3eec07 RAC1 UN 5.5.5.5 324.75 KiB 128 ? 5dfbaf14-8310-4d64-b61a-9da0a7a8a620 RAC1
Error on new host (7.7.7.7):
INFO [main] 2020-11-22 09:27:36,592 RangeStreamer.java:323 - Bootstrap: range Full(/7.7.7.7:7000,(-7734271291904743823,-7725025391901227341]) exists on Full(/1.1.1.1:7000,(-7734271291904743823,-7725025391901227341]) for keyspace system_traces ERROR [main] 2020-11-22 09:27:36,609 CassandraDaemon.java:817 - Exception encountered during startup java.lang.IllegalStateException: Multiple strict sources found for Full(/7.7.7.7:7000,(9208454697546654053,-9212763148842799409]), sources: [Full(/2.2.2.2:7000,(9189373804905478622,-9212763148842799409]), Full(/1.1.1.1:7000,(9189373804905478622,-9212763148842799409])] at org.apache.cassandra.dht.RangeStreamer.calculateRangesToFetchWithPreferredEndpoints(RangeStreamer.java:517) at org.apache.cassandra.dht.RangeStreamer.calculateRangesToFetchWithPreferredEndpoints(RangeStreamer.java:383) at org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:320) at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:81) at org.apache.cassandra.service.StorageService.startBootstrap(StorageService.java:1635) at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1612) at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:931) at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:892) at org.apache.cassandra.service.StorageService.initServer(StorageService.java:699) at org.apache.cassandra.service.StorageService.initServer(StorageService.java:635) at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:407) at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:671) at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:795) INFO [StorageServiceShutdownHook] 2020-11-22 09:27:36,616 HintsService.java:220 - Paused hints dispatch WARN [StorageServiceShutdownHook] 2020-11-22 09:27:36,617 Gossiper.java:1849 - No local state, state is in silent shutdown, or node hasn't joined, not announcing shutdown INFO [StorageServiceShutdownHook] 2020-11-22 09:27:36,617 MessagingService.java:441 - Waiting for messaging service to quiesce
Errors on seed node:
INFO [Messaging-EventLoop-3-37] 2020-11-22 09:26:54,121 InboundConnectionInitiator.java:459 - /7.7.7.7:7000(/7.7.7.7:45490)->/1.1.1.1:7000-URGENT_MESSAGES-f412b9ba messaging connection established, version = 12, framing = LZ4, encryption = disabled INFO [Messaging-EventLoop-3-40] 2020-11-22 09:26:54,157 OutboundConnection.java:1151 - /1.1.1.1:7000(/1.1.1.1:38678)->/7.7.7.7:7000-URGENT_MESSAGES-3b23f639 successfully connected, version = 12, framing = LZ4, encryption = disabled INFO [GossipStage:1] 2020-11-22 09:26:56,109 Gossiper.java:1248 - Node /7.7.7.7:7000 is now part of the cluster INFO [GossipStage:1] 2020-11-22 09:26:56,147 Gossiper.java:1206 - InetAddress /7.7.7.7:7000 is now UP INFO [Messaging-EventLoop-3-1] 2020-11-22 09:26:57,378 InboundConnectionInitiator.java:459 - /7.7.7.7:7000(/7.7.7.7:45504)->/1.1.1.1:7000-SMALL_MESSAGES-77c813d4 messaging connection established, version = 12, framing = LZ4, encryption = disabled INFO [Messaging-EventLoop-3-38] 2020-11-22 09:26:57,383 OutboundConnection.java:1151 - /1.1.1.1:7000(/1.1.1.1:38680)->/7.7.7.7:7000-SMALL_MESSAGES-e75eaaa2 successfully connected, version = 12, framing = LZ4, encryption = disabled INFO [Messaging-EventLoop-3-2] 2020-11-22 09:27:36,623 InboundConnectionInitiator.java:459 - /7.7.7.7:7000(/7.7.7.7:45534)->/1.1.1.1:7000-LARGE_MESSAGES-defc02dd messaging connection established, version = 12, framing = LZ4, encryption = disabled INFO [Messaging-EventLoop-3-40] 2020-11-22 09:27:37,151 NoSpamLogger.java:92 - /1.1.1.1:7000->/7.7.7.7:7000-URGENT_MESSAGES-[no-channel] failed to connect io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: Connection refused: /7.7.7.7:7000 Caused by: java.net.ConnectException: finishConnect(..) failed: Connection refused at io.netty.channel.unix.Errors.throwConnectException(Errors.java:124) at io.netty.channel.unix.Socket.finishConnect(Socket.java:243) at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.doFinishConnect(AbstractEpollChannel.java:672) at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.finishConnect(AbstractEpollChannel.java:649) at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.epollOutReady(AbstractEpollChannel.java:529) at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:465) at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378) at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) at java.base/java.lang.Thread.run(Thread.java:834) INFO [GossipStage:1] 2020-11-22 09:27:57,154 Gossiper.java:1224 - InetAddress /7.7.7.7:7000 is now DOWN
Looks that calculateRangesToFetchWithPreferredEndpoints added as part of "Transient Replication" https://github.com/apache/cassandra/commit/f7431b432875e334170ccdb19934d05545d2cebd
It's new fresh cluster without data and the join failing after few seconds.
Thank you,
Yakir Gibraltar.
Attachments
Issue Links
- is a parent of
-
CASSANDRA-16411 Fix topology corruption on joining nodes without DC in dtests
- Resolved
- relates to
-
CASSANDRA-5953 Replication validation should reject RF > nodes in cluster
- Resolved
- links to