Details
-
Bug
-
Status: Resolved
-
Low
-
Resolution: Fixed
-
None
-
Seen running cassandra 2.0.7 running on Red Hat Linux
-
Low
Description
After decommissioning a node, then re-bootstrapping it, it's not possible to truncate column families until cassandra is restarted.
Steps to reproduce:
- Start with a two node deployment (nodes A and B)
- Run nodetool decommission on node B
- Stop cassandra on node B
- Delete the contents of the cassandra data and commitlog directories
- Start cassandra on node B with node A as the seed
- Run cqlsh on node B and try to truncate a column family
- cqlsh displays: "Unable to complete request: one or more nodes were unavailable."
According to the logs node B seems to think that itself is down. The follow logs appear when the server is started and there are no further logs to indicate the B is now UP (A=10.225.45.150, B=10.225.45.151):
INFO [main] 2014-05-29 10:40:11,090 MessagingService.java (line 461) Starting Messaging Service on port 7000
INFO [HANDSHAKE-/10.225.45.150] 2014-05-29 10:40:11,106 OutboundTcpConnection.java (line 386) Handshaking version with /10.225.45.150
INFO [GossipStage:1] 2014-05-29 10:40:11,182 Gossiper.java (line 903) Node /10.225.45.150 is now part of the cluster
INFO [GossipStage:1] 2014-05-29 10:40:11,185 Gossiper.java (line 883) InetAddress /10.225.45.151 is now DOWN
INFO [RequestResponseStage:1] 2014-05-29 10:40:11,215 Gossiper.java (line 869) InetAddress /10.225.45.150 is now UP
This problem isn't hit if cassandra is restarted on node A while node B is stopped. The problem goes away if node B is restarted.