Details
Type: Bug
Status: Resolved
Priority: Low
Resolution: Not A Problem
Environment: Amazon AWS Linux, Large instance (8 GB of RAM, ephemeral storage). 12-node cluster. Replication factor 3; all queries performed with LOCAL_QUORUM.
Severity: Low
Description
We upgraded from Cassandra 1.1.2 to 1.1.9 yesterday. All indications are that the upgrade went well. Repair works as expected, and all our data is available. Performance is as good as, if not better than, it was previously.
However, nodetool ring is reporting inconsistent and incorrect results. This was my ring information before the upgrade:
Address DC Rack Status State Load Effective-Ownership Token
Token(bytes[eaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa8])
10.0.4.22 us-east 1a Up Normal 77.75 GB 25.00% Token(bytes[00000000000000000000000000000001])
10.0.10.23 us-east 1d Up Normal 82.68 GB 25.00% Token(bytes[15555555555555555555555555555555])
10.0.8.20 us-east 1c Up Normal 81.72 GB 25.00% Token(bytes[2aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa])
10.0.4.23 us-east 1a Up Normal 82.65 GB 25.00% Token(bytes[40000000000000000000000000000000])
10.0.10.20 us-east 1d Up Normal 80.2 GB 25.00% Token(bytes[55555555555555555555555555555554])
10.0.8.23 us-east 1c Up Normal 77.06 GB 25.00% Token(bytes[6aaaaaaaaaaaaaaaaaaaaaaaaaaaaaac])
10.0.4.21 us-east 1a Up Normal 81.37 GB 25.00% Token(bytes[80000000000000000000000000000000])
10.0.10.24 us-east 1d Up Normal 83.37 GB 25.00% Token(bytes[95555555555555555555555555555558])
10.0.8.21 us-east 1c Up Normal 84.33 GB 25.00% Token(bytes[aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa8])
10.0.4.25 us-east 1a Up Normal 79.91 GB 25.00% Token(bytes[c0000000000000000000000000000000])
10.0.10.21 us-east 1d Up Normal 83.46 GB 25.00% Token(bytes[d5555555555555555555555555555558])
10.0.8.24 us-east 1c Up Normal 90.66 GB 25.00% Token(bytes[eaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa8])
This is my ring information after the upgrade:
Address DC Rack Status State Load Effective-Ownership Token
Token(bytes[eaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa8])
10.0.4.22 us-east 1a Up Normal 77.74 GB 99.89% Token(bytes[00000000000000000000000000000001])
10.0.10.23 us-east 1d Up Normal 82.82 GB 64.14% Token(bytes[15555555555555555555555555555555])
10.0.8.20 us-east 1c Up Normal 81.89 GB 30.55% Token(bytes[2aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa])
10.0.4.23 us-east 1a Up Normal 82.77 GB 0.04% Token(bytes[40000000000000000000000000000000])
10.0.10.20 us-east 1d Up Normal 80.32 GB 0.04% Token(bytes[55555555555555555555555555555554])
10.0.8.23 us-east 1c Up Normal 77.07 GB 0.04% Token(bytes[6aaaaaaaaaaaaaaaaaaaaaaaaaaaaaac])
10.0.4.21 us-east 1a Up Normal 81.35 GB 0.04% Token(bytes[80000000000000000000000000000000])
10.0.10.24 us-east 1d Up Normal 83.49 GB 0.04% Token(bytes[95555555555555555555555555555558])
10.0.8.21 us-east 1c Up Normal 84.47 GB 0.04% Token(bytes[aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa8])
10.0.4.25 us-east 1a Up Normal 80.11 GB 0.04% Token(bytes[c0000000000000000000000000000000])
10.0.10.21 us-east 1d Up Normal 83.5 GB 35.79% Token(bytes[d5555555555555555555555555555558])
10.0.8.24 us-east 1c Up Normal 90.72 GB 69.38% Token(bytes[eaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa8])
We use ByteOrderedPartitioner (we hash our own keys), and as you can see above, we were achieving a roughly equal distribution of data among the nodes: with RF=3 across 12 evenly spaced tokens, each node replicates 3/12 of the ring, hence the 25.00% effective ownership everywhere before the upgrade.
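For context, the client-side hashing we mean looks roughly like the minimal sketch below; the class and method names are illustrative, not our production code:

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Minimal sketch of client-side key hashing for ByteOrderedPartitioner:
// using an MD5 digest as the row key spreads keys uniformly across the
// byte-ordered token space. Names here are illustrative.
public final class KeyHasher {
    public static byte[] hashedKey(String naturalKey) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            // The 16-byte digest becomes the row key, so byte-ordered
            // tokens end up roughly uniformly distributed around the ring.
            return md5.digest(naturalKey.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new AssertionError("MD5 is a required JDK algorithm", e);
        }
    }

    public static void main(String[] args) {
        byte[] key = hashedKey("user:12345");
        // Hex view of the key, comparable to the Token(bytes[...]) values above.
        System.out.println(new BigInteger(1, key).toString(16));
    }
}
```

Because the digest output is uniform, keys under ByteOrderedPartitioner behave much like they would under RandomPartitioner, which is why the evenly spaced tokens above gave us balanced load.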
The node that always seems to own 99% of the keys is the node that I run "nodetool ring" on. Running "nodetool ring" on two of these nodes at the same time resulted in:
From 10.0.4.22:
Address DC Rack Status State Load Effective-Ownership Token
Token(bytes[eaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa8])
10.0.4.22 us-east 1a Up Normal 77.72 GB 99.89% Token(bytes[00000000000000000000000000000001])
10.0.10.23 us-east 1d Up Normal 82.74 GB 64.13% Token(bytes[15555555555555555555555555555555])
10.0.8.20 us-east 1c Up Normal 81.79 GB 30.55% Token(bytes[2aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa])
10.0.4.23 us-east 1a Up Normal 82.66 GB 0.04% Token(bytes[40000000000000000000000000000000])
10.0.10.20 us-east 1d Up Normal 80.21 GB 0.04% Token(bytes[55555555555555555555555555555554])
10.0.8.23 us-east 1c Up Normal 77.07 GB 0.04% Token(bytes[6aaaaaaaaaaaaaaaaaaaaaaaaaaaaaac])
10.0.4.21 us-east 1a Up Normal 81.38 GB 0.04% Token(bytes[80000000000000000000000000000000])
10.0.10.24 us-east 1d Up Normal 83.43 GB 0.04% Token(bytes[95555555555555555555555555555558])
10.0.8.21 us-east 1c Up Normal 84.42 GB 0.04% Token(bytes[aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa8])
10.0.4.25 us-east 1a Up Normal 80.06 GB 0.04% Token(bytes[c0000000000000000000000000000000])
10.0.10.21 us-east 1d Up Normal 83.49 GB 35.80% Token(bytes[d5555555555555555555555555555558])
10.0.8.24 us-east 1c Up Normal 90.72 GB 69.37% Token(bytes[eaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa8])
From 10.0.8.23:
Address DC Rack Status State Load Effective-Ownership Token
Token(bytes[eaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa8])
10.0.4.22 us-east 1a Up Normal 77.72 GB 0.04% Token(bytes[00000000000000000000000000000001])
10.0.10.23 us-east 1d Up Normal 82.78 GB 0.04% Token(bytes[15555555555555555555555555555555])
10.0.8.20 us-east 1c Up Normal 81.79 GB 0.04% Token(bytes[2aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa])
10.0.4.23 us-east 1a Up Normal 82.66 GB 33.84% Token(bytes[40000000000000000000000000000000])
10.0.10.20 us-east 1d Up Normal 80.21 GB 67.51% Token(bytes[55555555555555555555555555555554])
10.0.8.23 us-east 1c Up Normal 77.07 GB 99.89% Token(bytes[6aaaaaaaaaaaaaaaaaaaaaaaaaaaaaac])
10.0.4.21 us-east 1a Up Normal 81.38 GB 66.09% Token(bytes[80000000000000000000000000000000])
10.0.10.24 us-east 1d Up Normal 83.43 GB 32.41% Token(bytes[95555555555555555555555555555558])
10.0.8.21 us-east 1c Up Normal 84.42 GB 0.04% Token(bytes[aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa8])
10.0.4.25 us-east 1a Up Normal 80.06 GB 0.04% Token(bytes[c0000000000000000000000000000000])
10.0.10.21 us-east 1d Up Normal 83.49 GB 0.04% Token(bytes[d5555555555555555555555555555558])
10.0.8.24 us-east 1c Up Normal 90.72 GB 0.04% Token(bytes[eaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa8])
I did a little digging and see some changes to how Cassandra calculates splits in 1.1.8 as part of CASSANDRA-4803 (StorageService.getSplits()), although I'm not familiar enough with the code to tell whether it's the cause.
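For reference, the expected numbers can be sanity-checked with a toy ownership calculation over the ring's tokens. This is purely illustrative and not Cassandra's actual implementation (which goes through the replication strategy and is rack-aware); it assumes RF consecutive replicas and only shows that 12 evenly spaced tokens with RF=3 should report 25% everywhere, which is what we saw before the upgrade:

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Illustrative only -- NOT Cassandra's actual ownership code. Each node's
// primary range runs from the previous token (exclusive) to its own token
// (inclusive); with RF consecutive replicas, a node also holds the primary
// ranges of the RF-1 nodes that precede it on the ring.
public final class OwnershipSketch {
    // 2^128: the token space for 16-byte ByteOrderedPartitioner tokens.
    private static final BigInteger RING = BigInteger.ONE.shiftLeft(128);

    public static double[] effectiveOwnership(List<BigInteger> sortedTokens, int rf) {
        int n = sortedTokens.size();
        double[] primary = new double[n];
        for (int i = 0; i < n; i++) {
            BigInteger prev = sortedTokens.get((i - 1 + n) % n);
            // mod(RING) handles the wrap-around range owned by the first token.
            BigInteger width = sortedTokens.get(i).subtract(prev).mod(RING);
            primary[i] = width.doubleValue() / RING.doubleValue();
        }
        double[] effective = new double[n];
        for (int i = 0; i < n; i++)
            for (int r = 0; r < rf; r++)
                effective[i] += primary[(i - r + n) % n]; // own range + RF-1 predecessors'
        return effective;
    }

    public static void main(String[] args) {
        // 12 evenly spaced tokens, RF=3: every node should show 3/12 = 25.00%.
        List<BigInteger> tokens = new ArrayList<>();
        for (int i = 0; i < 12; i++)
            tokens.add(RING.multiply(BigInteger.valueOf(i)).divide(BigInteger.valueOf(12)));
        for (double o : effectiveOwnership(tokens, 3))
            System.out.printf("%.2f%%%n", o * 100);
    }
}
```

Since the tokens themselves did not change across the upgrade, a calculation along these lines should still give 25% per node, which is why the post-upgrade output looks to me like a nodetool reporting problem rather than an actual data-distribution problem.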