Details
Type: Improvement
Status: Open
Priority: Normal
Resolution: Unresolved
Description
Setting read_repair_chance = 0.0 and dclocal_read_repair_chance = 0.0 should cause no read repair, but read repair still happens when speculative retry is enabled. I think read_repair_chance = 0.0 and dclocal_read_repair_chance = 0.0 should stop read repair completely, because users set these options precisely when they want read repair disabled.
The documentation describes how to disable read repair:
While TWCS tries to minimize the impact of comingled data, users should attempt to avoid this behavior. Specifically, users should avoid queries that explicitly set the timestamp via CQL USING TIMESTAMP. Additionally, users should run frequent repairs (which streams data in such a way that it does not become comingled), and disable background read repair by setting the table’s read_repair_chance and dclocal_read_repair_chance to 0.
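For reference, that advice maps to the following CQL (a sketch against the ks1.t1 table created in the reproduction below):

ALTER TABLE ks1.t1
    WITH read_repair_chance = 0.0
    AND dclocal_read_repair_chance = 0.0;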
During peak hours, read latency is critical for us, and read repair pushes latency higher than it would be without it. We can run anti-entropy repair during off-peak hours to keep replicas consistent.
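As a sketch of that off-peak alternative (the schedule, and the assumption that nodetool is on the PATH of the user running cron, are mine, not part of this report), a cron entry could run a full anti-entropy repair of the keyspace nightly:

# Hypothetical crontab entry: full anti-entropy repair of ks1 at 03:00 local time.
0 3 * * * nodetool repair -full ks1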
Here is my procedure to reproduce the problem.
1. Create a cluster and set hinted_handoff_enabled to false.
$ ccm create -v 3.0.14 -n 3 cluster_3.0.14
$ for h in $(seq 1 3) ; do perl -pi -e 's/hinted_handoff_enabled: true/hinted_handoff_enabled: false/' ~/.ccm/cluster_3.0.14/node$h/conf/cassandra.yaml ; done
$ for h in $(seq 1 3) ; do grep "hinted_handoff_enabled:" ~/.ccm/cluster_3.0.14/node$h/conf/cassandra.yaml ; done
hinted_handoff_enabled: false
hinted_handoff_enabled: false
hinted_handoff_enabled: false
$ ccm start
2. Create a keyspace and a table.
$ ccm node1 cqlsh
DROP KEYSPACE IF EXISTS ks1;
CREATE KEYSPACE ks1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;
CREATE TABLE ks1.t1 (
    key text PRIMARY KEY,
    value blob
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = 'ALWAYS';
QUIT;
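Note that speculative_retry = 'ALWAYS' is what triggers the extra replica read, and hence the digest comparison, seen in step 5 below. As a workaround sketch rather than a fix for the reported behavior, the option can be relaxed to one of the other accepted values:

-- speculative_retry accepts 'ALWAYS', 'NONE', a percentile such as
-- '99PERCENTILE', or a fixed latency such as '50ms'.
ALTER TABLE ks1.t1 WITH speculative_retry = 'NONE';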
3. Stop node2 and node3. Insert a row.
$ ccm node3 stop && ccm node2 stop && ccm status
Cluster: 'cluster_3.0.14'
-------------------------
node1: UP
node3: DOWN
node2: DOWN

$ ccm node1 cqlsh -k ks1 -e "consistency; tracing on; insert into ks1.t1 (key, value) values ('mmullass', bigintAsBlob(1));"
Current consistency level is ONE.
Now Tracing is enabled

Tracing session: 01d74590-97cb-11e7-8ea7-c1bd4d549501

 activity | timestamp | source | source_elapsed
----------+-----------+--------+----------------
 Execute CQL3 query | 2017-09-12 23:59:42.316000 | 127.0.0.1 | 0
 Parsing insert into ks1.t1 (key, value) values ('mmullass', bigintAsBlob(1)); [SharedPool-Worker-1] | 2017-09-12 23:59:42.319000 | 127.0.0.1 | 4323
 Preparing statement [SharedPool-Worker-1] | 2017-09-12 23:59:42.320000 | 127.0.0.1 | 5250
 Determining replicas for mutation [SharedPool-Worker-1] | 2017-09-12 23:59:42.327000 | 127.0.0.1 | 11886
 Appending to commitlog [SharedPool-Worker-3] | 2017-09-12 23:59:42.327000 | 127.0.0.1 | 12195
 Adding to t1 memtable [SharedPool-Worker-3] | 2017-09-12 23:59:42.327000 | 127.0.0.1 | 12392
 Request complete | 2017-09-12 23:59:42.328680 | 127.0.0.1 | 12680

$ ccm node1 cqlsh -k ks1 -e "consistency; tracing on; select * from ks1.t1 where key = 'mmullass';"
Current consistency level is ONE.
Now Tracing is enabled

 key      | value
----------+--------------------
 mmullass | 0x0000000000000001

(1 rows)

Tracing session: 3420ce90-97cb-11e7-8ea7-c1bd4d549501

 activity | timestamp | source | source_elapsed
----------+-----------+--------+----------------
 Execute CQL3 query | 2017-09-13 00:01:06.681000 | 127.0.0.1 | 0
 Parsing select * from ks1.t1 where key = 'mmullass'; [SharedPool-Worker-1] | 2017-09-13 00:01:06.681000 | 127.0.0.1 | 296
 Preparing statement [SharedPool-Worker-1] | 2017-09-13 00:01:06.681000 | 127.0.0.1 | 561
 Executing single-partition query on t1 [SharedPool-Worker-2] | 2017-09-13 00:01:06.682000 | 127.0.0.1 | 1056
 Acquiring sstable references [SharedPool-Worker-2] | 2017-09-13 00:01:06.682000 | 127.0.0.1 | 1142
 Merging memtable contents [SharedPool-Worker-2] | 2017-09-13 00:01:06.682000 | 127.0.0.1 | 1206
 Read 1 live and 0 tombstone cells [SharedPool-Worker-2] | 2017-09-13 00:01:06.682000 | 127.0.0.1 | 1455
 Request complete | 2017-09-13 00:01:06.682794 | 127.0.0.1 | 1794
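Since hinted_handoff_enabled is false, node1 should not have stored hints for the two down replicas, so nothing but read repair can copy this row to them later. As a quick sanity check (a sketch; it assumes ccm's default data layout, like the paths used elsewhere in this report, and that Cassandra 3.0 stores hints as files in the hints directory), node1's hints directory should be empty:

$ ls ~/.ccm/cluster_3.0.14/node1/hints/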
4. Start node2 and confirm node2 has no data.
$ ccm node2 start && ccm status
Cluster: 'cluster_3.0.14'
-------------------------
node1: UP
node3: DOWN
node2: UP

$ ccm node2 nodetool flush
$ ls ~/.ccm/cluster_3.0.14/node2/data0/ks1/t1-*/*-Data.db
ls: /Users/hiwakaba/.ccm/cluster_3.0.14/node2/data0/ks1/t1-*/*-Data.db: No such file or directory
5. Select the row from node2 and observe that read repair runs anyway.
$ ccm node2 cqlsh -k ks1 -e "consistency; tracing on; select * from ks1.t1 where key = 'mmullass';"
Current consistency level is ONE.
Now Tracing is enabled

 key | value
-----+-------

(0 rows)

Tracing session: 72a71fc0-97cb-11e7-83cc-a3af9d3da979

 activity | timestamp | source | source_elapsed
----------+-----------+--------+----------------
 Execute CQL3 query | 2017-09-13 00:02:51.582000 | 127.0.0.2 | 0
 Parsing select * from ks1.t1 where key = 'mmullass'; [SharedPool-Worker-2] | 2017-09-13 00:02:51.583000 | 127.0.0.2 | 1112
 Preparing statement [SharedPool-Worker-2] | 2017-09-13 00:02:51.583000 | 127.0.0.2 | 1412
 reading data from /127.0.0.1 [SharedPool-Worker-2] | 2017-09-13 00:02:51.584000 | 127.0.0.2 | 2107
 Executing single-partition query on t1 [SharedPool-Worker-1] | 2017-09-13 00:02:51.585000 | 127.0.0.2 | 3492
 Sending READ message to /127.0.0.1 [MessagingService-Outgoing-/127.0.0.1] | 2017-09-13 00:02:51.585000 | 127.0.0.2 | 3516
 Acquiring sstable references [SharedPool-Worker-1] | 2017-09-13 00:02:51.585000 | 127.0.0.2 | 3595
 Merging memtable contents [SharedPool-Worker-1] | 2017-09-13 00:02:51.585001 | 127.0.0.2 | 3673
 Read 0 live and 0 tombstone cells [SharedPool-Worker-1] | 2017-09-13 00:02:51.585001 | 127.0.0.2 | 3851
 READ message received from /127.0.0.2 [MessagingService-Incoming-/127.0.0.2] | 2017-09-13 00:02:51.588000 | 127.0.0.1 | 33
 Acquiring sstable references [SharedPool-Worker-2] | 2017-09-13 00:02:51.600000 | 127.0.0.1 | 12444
 Merging memtable contents [SharedPool-Worker-2] | 2017-09-13 00:02:51.600000 | 127.0.0.1 | 12536
 Read 1 live and 0 tombstone cells [SharedPool-Worker-2] | 2017-09-13 00:02:51.600000 | 127.0.0.1 | 12765
 Enqueuing response to /127.0.0.2 [SharedPool-Worker-2] | 2017-09-13 00:02:51.600000 | 127.0.0.1 | 12929
 Sending REQUEST_RESPONSE message to /127.0.0.2 [MessagingService-Outgoing-/127.0.0.2] | 2017-09-13 00:02:51.602000 | 127.0.0.1 | 14686
 REQUEST_RESPONSE message received from /127.0.0.1 [MessagingService-Incoming-/127.0.0.1] | 2017-09-13 00:02:51.603000 | 127.0.0.2 | --
 Processing response from /127.0.0.1 [SharedPool-Worker-3] | 2017-09-13 00:02:51.610000 | 127.0.0.2 | --
 Initiating read-repair [SharedPool-Worker-3] | 2017-09-13 00:02:51.610000 | 127.0.0.2 | --
 Digest mismatch: org.apache.cassandra.service.DigestMismatchException: Mismatch for key DecoratedKey(-4886857781295767937, 6d6d756c6c617373) (d41d8cd98f00b204e9800998ecf8427e vs f8e0f9262a889cd3ebf4e5d50159757b) [ReadRepairStage:1] | 2017-09-13 00:02:51.624000 | 127.0.0.2 | --
 Request complete | 2017-09-13 00:02:51.586892 | 127.0.0.2 | 4892
6. As a result, node2 has the row.
$ ccm node2 cqlsh -k ks1 -e "consistency; tracing on; select * from ks1.t1 where key = 'mmullass';"
Current consistency level is ONE.
Now Tracing is enabled

 key      | value
----------+--------------------
 mmullass | 0x0000000000000001

(1 rows)

Tracing session: 78526330-97cb-11e7-83cc-a3af9d3da979

 activity | timestamp | source | source_elapsed
----------+-----------+--------+----------------
 Execute CQL3 query | 2017-09-13 00:03:01.091000 | 127.0.0.2 | 0
 Parsing select * from ks1.t1 where key = 'mmullass'; [SharedPool-Worker-3] | 2017-09-13 00:03:01.091000 | 127.0.0.2 | 216
 Preparing statement [SharedPool-Worker-3] | 2017-09-13 00:03:01.091000 | 127.0.0.2 | 390
 reading data from /127.0.0.1 [SharedPool-Worker-3] | 2017-09-13 00:03:01.091000 | 127.0.0.2 | 808
 Executing single-partition query on t1 [SharedPool-Worker-2] | 2017-09-13 00:03:01.092000 | 127.0.0.2 | 1041
 READ message received from /127.0.0.2 [MessagingService-Incoming-/127.0.0.2] | 2017-09-13 00:03:01.092000 | 127.0.0.1 | 33
 Sending READ message to /127.0.0.1 [MessagingService-Outgoing-/127.0.0.1] | 2017-09-13 00:03:01.092000 | 127.0.0.2 | 1036
 Executing single-partition query on t1 [SharedPool-Worker-1] | 2017-09-13 00:03:01.092000 | 127.0.0.1 | 189
 Acquiring sstable references [SharedPool-Worker-2] | 2017-09-13 00:03:01.092000 | 127.0.0.2 | 1113
 Acquiring sstable references [SharedPool-Worker-1] | 2017-09-13 00:03:01.092000 | 127.0.0.1 | 276
 Merging memtable contents [SharedPool-Worker-2] | 2017-09-13 00:03:01.092000 | 127.0.0.2 | 1172
 Merging memtable contents [SharedPool-Worker-1] | 2017-09-13 00:03:01.092000 | 127.0.0.1 | 332
 REQUEST_RESPONSE message received from /127.0.0.1 [MessagingService-Incoming-/127.0.0.1] | 2017-09-13 00:03:01.093000 | 127.0.0.2 | --
 Read 1 live and 0 tombstone cells [SharedPool-Worker-1] | 2017-09-13 00:03:01.093000 | 127.0.0.1 | 565
 Enqueuing response to /127.0.0.2 [SharedPool-Worker-1] | 2017-09-13 00:03:01.093000 | 127.0.0.1 | 648
 Sending REQUEST_RESPONSE message to /127.0.0.2 [MessagingService-Outgoing-/127.0.0.2] | 2017-09-13 00:03:01.093000 | 127.0.0.1 | 783
 Processing response from /127.0.0.1 [SharedPool-Worker-1] | 2017-09-13 00:03:01.094000 | 127.0.0.2 | --
 Initiating read-repair [SharedPool-Worker-1] | 2017-09-13 00:03:01.099000 | 127.0.0.2 | --
 Read 1 live and 0 tombstone cells [SharedPool-Worker-2] | 2017-09-13 00:03:01.101000 | 127.0.0.2 | 10113
 Request complete | 2017-09-13 00:03:01.092830 | 127.0.0.2 | 1830

$ ccm node2 nodetool flush
$ ls ~/.ccm/cluster_3.0.14/node2/data0/ks1/t1-*/*-Data.db
/Users/hiwakaba/.ccm/cluster_3.0.14/node2/data0/ks1/t1-ec659e0097ca11e78ea7c1bd4d549501/mc-1-big-Data.db
$ ~/.ccm/repository/3.0.14/tools/bin/sstabledump /Users/hiwakaba/.ccm/cluster_3.0.14/node2/data0/ks1/t1-ec659e0097ca11e78ea7c1bd4d549501/mc-1-big-Data.db -k mmullass
[
  {
    "partition" : {
      "key" : [ "mmullass" ],
      "position" : 0
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 36,
        "liveness_info" : { "tstamp" : "2017-09-12T14:59:42.312969Z" },
        "cells" : [
          { "name" : "value", "value" : "0000000000000001" }
        ]
      }
    ]
  }
]
In CASSANDRA-11409, Cameron Zemek commented that this behavior was not a bug, so I filed this issue as an improvement.
Attachments
Issue Links
is related to: CASSANDRA-10726 "Read repair inserts should not be blocking" (Resolved)