[CASSANDRA-2072] Race condition during decommission - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Low
Resolution: Fixed
Fix Version/s: 0.7.1
Component/s: None
Labels: None
Severity: Low

Description

Occasionally, decommissioning a node triggers a race condition in which another node never removes the decommissioned node's token and instead propagates it again with a state of down. We can resolve this with CASSANDRA-1900, but it shouldn't occur in the first place.

Given nodes A, B, and C, decommissioning B causes it to stream its data to A and C. When streaming completes, B finishes decommissioning and logs this stack trace:

ERROR 00:02:40,282 Fatal exception in thread Thread[Thread-5,5,main]
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:62)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:387)
at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:91)
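
The trace shows an inbound message being handed to an executor that has already shut down. The following is a minimal sketch of that mechanism using only java.util.concurrent. It does not use Cassandra's DebuggableThreadPoolExecutor or MessagingService, and the class and method names are hypothetical, chosen for illustration only.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

public class ShutdownRejectionSketch {
    /** Returns true if a task handed to an already shut-down executor is rejected. */
    static boolean lateMessageIsRejected() {
        // Stands in for a message-handling stage on node B (an assumption for illustration).
        ExecutorService stage = Executors.newSingleThreadExecutor();
        // B shuts its stage down as it finishes decommissioning.
        stage.shutdown();
        try {
            // A message that arrives after shutdown is submitted to the dead stage.
            stage.execute(() -> { /* handle the late message */ });
            return false;
        } catch (RejectedExecutionException e) {
            // The default rejection policy aborts the submission with this exception.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("late message rejected: " + lateMessageIsRejected());
    }
}
```

Running main prints "late message rejected: true". The point is only that any message still in flight when the executor stops is dropped with an exception rather than processed.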

At this point A shows that it is removing B's token, but C does not. Instead, C's failure detector reports that B is dead, and nodetool ring on C shows B in a leaving/down state. In a later gossip round, C propagates this state back to A.
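
The way a stale view on C can reach A again can be illustrated with a deliberately simplified toy model. This is not Cassandra's Gossiper: the maps, the state strings, and the naive merge rule (a receiver adopts any endpoint state it does not already hold) are all assumptions made for the sketch, and real gossip also compares versions.

```java
import java.util.HashMap;
import java.util.Map;

public class StaleGossipSketch {
    /** Naive merge: the receiver adopts any endpoint state it does not already hold. */
    static void gossip(Map<String, String> sender, Map<String, String> receiver) {
        for (Map.Entry<String, String> e : sender.entrySet()) {
            receiver.putIfAbsent(e.getKey(), e.getValue());
        }
    }

    /** Replays the scenario and returns what A believes about B afterwards. */
    static String stateOfBOnAAfterGossip() {
        Map<String, String> viewOfA = new HashMap<>();
        Map<String, String> viewOfC = new HashMap<>();
        // Both peers first learn that B is leaving.
        viewOfA.put("B", "LEAVING");
        viewOfC.put("B", "LEAVING");

        // A sees B's final announcement and removes B's token.
        viewOfA.remove("B");
        // C never sees that announcement, so its entry for B stays stale.

        // A later gossip round from C to A reintroduces the stale entry.
        gossip(viewOfC, viewOfA);
        return viewOfA.get("B");
    }

    public static void main(String[] args) {
        System.out.println("A's view of B: " + stateOfBOnAAfterGossip());
    }
}
```

In this toy model A ends up holding B's leaving state again even though it had already removed the token, which mirrors the symptom described above.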

Attachments

0001-announce-having-left-the-ring-for-RING_DELAY-on-deco.patch (28/Jan/11 20:28, 1 kB, Brandon Williams)
0002-Improve-TRACE-logging-for-Gossiper.patch (28/Jan/11 20:28, 4 kB, Brandon Williams)
0003-Remove-endpoint-state-when-expiring-justRemovedEndpo.patch (31/Jan/11 19:44, 2 kB, Brandon Williams)

People

Assignee: Brandon Williams

Reporter: Brandon Williams

Authors: Brandon Williams

Reviewers: Gary Dusbabek

Votes: 0

Watchers: 0

Dates

Created: 28/Jan/11 00:06

Updated: 16/Apr/19 09:33

Resolved: 03/Feb/11 20:17