[CASSANDRA-10887] Pending range calculator gives wrong pending ranges for moves - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Urgent
Resolution: Fixed
Fix Version/s: 2.1.13, 2.2.5, 3.0.3, 3.3
Component/s: Feature/Lightweight Transactions, Legacy/Coordination
Labels: LWT
Severity: Critical
Since Version: 0.8 beta 1

Description

My understanding is that the PendingRangeCalculator is meant to work out which nodes should receive extra writes during range movements. For moves, however, it adds the wrong ranges. The following reproduction shows an extreme case.

Create a 5-node cluster (I did this on 2.0.16 and 2.2.4), a keyspace with RF=3, and a simple table. Then start moving a node and immediately kill -9 it. The ring now shows that node as down and moving. A quorum write to a partition stored on that node fails with a timeout. Further, all CAS reads and writes fail immediately with an unavailable exception, because they attempt to include the moving node twice. This is likely the cause of CASSANDRA-10423.

In my example I had this ring:

127.0.0.1  rack1  Up    Normal  170.97 KB  20.00%  -9223372036854775808
127.0.0.2  rack1  Up    Normal  124.06 KB  20.00%  -5534023222112865485
127.0.0.3  rack1  Down  Moving  108.7 KB   40.00%  1844674407370955160
127.0.0.4  rack1  Up    Normal  142.58 KB  0.00%   1844674407370955161
127.0.0.5  rack1  Up    Normal  118.64 KB  20.00%  5534023222112865484

Node 3 was moving to -1844674407370955160. I added logging to print the pending and natural endpoints. For ranges owned by node 3, node 3 appeared in both the pending and the natural endpoints. As a result, blockFor is increased to 3, so we’re effectively doing CL.ALL operations. This manifests as write timeouts and CAS unavailable exceptions when the node is down.
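The arithmetic behind these failures can be sketched in a few lines. This is an illustrative model, not Cassandra code: the function names are hypothetical, and both rules are assumptions (a quorum write waits for a quorum of RF plus one acknowledgement per pending endpoint; a CAS operation needs a live majority of natural plus pending participants, duplicates included). Under those assumptions, listing the down node 3 as pending makes both operations fail, while listing node 1 as pending lets both succeed.

```python
# Illustrative model only; these helpers are hypothetical, not Cassandra APIs.

def quorum(rf: int) -> int:
    """Majority of rf replicas."""
    return rf // 2 + 1


def write_block_for(rf: int, pending: list[str]) -> int:
    """Assumed rule: quorum(rf) acks plus one ack per pending endpoint."""
    return quorum(rf) + len(pending)


def live(endpoints: list[str], down: set[str]) -> list[str]:
    """Endpoints that are not marked down, preserving duplicates."""
    return [e for e in endpoints if e not in down]


def quorum_write_ok(rf: int, natural: list[str], pending: list[str],
                    down: set[str]) -> bool:
    # Each distinct live endpoint can acknowledge the write once.
    acks = len(set(live(natural + pending, down)))
    return acks >= write_block_for(rf, pending)


def cas_available(natural: list[str], pending: list[str],
                  down: set[str]) -> bool:
    # Assumed rule: a majority of natural + pending participants must be live.
    participants = len(natural) + len(pending)
    required = participants // 2 + 1
    return len(live(natural + pending, down)) >= required


# Natural endpoints for a range owned by node 3 in the example ring (RF=3).
natural = ["127.0.0.3", "127.0.0.4", "127.0.0.5"]
down = {"127.0.0.3"}
buggy_pending = ["127.0.0.3"]     # what the logging showed
expected_pending = ["127.0.0.1"]  # what the report argues is correct

print(quorum_write_ok(3, natural, buggy_pending, down))
print(cas_available(natural, buggy_pending, down))
print(quorum_write_ok(3, natural, expected_pending, down))
print(cas_available(natural, expected_pending, down))
```

In this model the required count is 3 in both cases; the difference is that the buggy pending endpoint adds no new live node, so only nodes 4 and 5 can respond.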

The correct pending range for this scenario is that node 1 gains the range (-1844674407370955160, 1844674407370955160). So node 1, not node 3, should be added as a destination for writes and CAS for this range.
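This expectation can be checked with a small ring model. The sketch below is not the PendingRangeCalculator itself: it assumes a rack-unaware placement where each range is replicated to the next RF distinct nodes clockwise, and it treats a pending endpoint as any node that replicates a range after the move but not before it. The names `replicas` and `pending_ranges` are hypothetical.

```python
from bisect import bisect_left


def replicas(ring: dict[int, str], token: int, rf: int) -> list[str]:
    """First rf distinct nodes clockwise from the first ring token >= token."""
    if rf > len(set(ring.values())):
        raise ValueError("rf exceeds the number of nodes")
    tokens = sorted(ring)
    i = bisect_left(tokens, token)
    result: list[str] = []
    while len(result) < rf:
        node = ring[tokens[i % len(tokens)]]  # wrap around the ring
        if node not in result:
            result.append(node)
        i += 1
    return result


def pending_ranges(before: dict[int, str], after: dict[int, str],
                   rf: int) -> dict[tuple[int, int], set[str]]:
    """Map each (left, right] range to the nodes that gain it in the move."""
    bounds = sorted(set(before) | set(after))
    gained: dict[tuple[int, int], set[str]] = {}
    for i, right in enumerate(bounds):
        left = bounds[i - 1]  # index -1 wraps to the last token when i == 0
        new_nodes = set(replicas(after, right, rf)) - set(replicas(before, right, rf))
        if new_nodes:
            gained[(left, right)] = new_nodes
    return gained


# Ring from the report, before the move.
before = {
    -9223372036854775808: "127.0.0.1",
    -5534023222112865485: "127.0.0.2",
    1844674407370955160: "127.0.0.3",
    1844674407370955161: "127.0.0.4",
    5534023222112865484: "127.0.0.5",
}

# Same ring with node 3 moved to its target token.
after = dict(before)
del after[1844674407370955160]
after[-1844674407370955160] = "127.0.0.3"

result = pending_ranges(before, after, rf=3)
print(result)
```

Under these assumptions the only entry is the range between node 3's new and old tokens, gained by 127.0.0.1, which matches the pending range described above.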

Attachments

CASSANDRA_10887_3.0.diff  06/Jan/16 23:55  42 kB  Sankalp Kohli
CASSANDRA_10887_2.2.diff  06/Jan/16 22:50  41 kB  Sankalp Kohli
CASSANDRA_10887_v3.diff   05/Jan/16 07:11  35 kB  Sankalp Kohli
CASSANDRA_10887_v2.diff   29/Dec/15 18:52  31 kB  Sankalp Kohli
CASSANDRA-10887.diff      19/Dec/15 00:12  28 kB  Sankalp Kohli

Issue Links

is duplicated by: CASSANDRA-10423 Paxos/LWT failures when moving node (Resolved)
is related to: CASSANDRA-10423 Paxos/LWT failures when moving node (Resolved)

People

Assignee: Sankalp Kohli
Reporter: Richard Low
Authors: Sankalp Kohli
Reviewers: Branimir Lambov
Votes: 0
Watchers: 11

Dates

Created: 17/Dec/15 04:27
Updated: 25/Oct/19 13:11
Resolved: 08/Jan/16 14:33