[CASSANDRA-13863] Speculative retry causes read repair even if read_repair_chance is 0.0. - ASF JIRA

Details

Type: Improvement
Status: Open
Priority: Normal
Resolution: Unresolved
Fix Version/s: None
Component/s: Legacy/Coordination
Labels:
None

Description

Setting read_repair_chance = 0.0 and dclocal_read_repair_chance = 0.0 should prevent read repair, but read repair still happens when speculative retry is in use. I think these two settings should disable read repair completely, because there are cases where users need to turn it off.

Case 1: TWCS users

The TWCS documentation recommends disabling read repair:

While TWCS tries to minimize the impact of comingled data, users should attempt to avoid this behavior. Specifically, users should avoid queries that explicitly set the timestamp via CQL USING TIMESTAMP. Additionally, users should run frequent repairs (which streams data in such a way that it does not become comingled), and disable background read repair by setting the table’s read_repair_chance and dclocal_read_repair_chance to 0.
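
For reference, the two table options named in the quoted documentation can be set with a single ALTER TABLE statement. The sketch below is a small shell helper that only builds that statement. The function name `disable_read_repair_cql` is hypothetical, and it assumes a Cassandra version that supports both options, such as the 3.0.14 used in this report.

```shell
# Hypothetical helper: print the CQL that sets both read repair chances to 0
# for the given table (for example ks1.t1). It does not contact the cluster.
disable_read_repair_cql() {
  printf 'ALTER TABLE %s WITH read_repair_chance = 0.0 AND dclocal_read_repair_chance = 0.0;\n' "$1"
}
```

It could be applied with something like `ccm node1 cqlsh -e "$(disable_read_repair_cql ks1.t1)"`.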

Case 2: Strict SLA for read latency

During peak hours, read latency is critical for us, and read repair makes reads slower than they would be without it. We can run anti-entropy repair during off-peak hours to restore consistency.

Here is my procedure to reproduce the problem.

1. Create a cluster and set `hinted_handoff_enabled` to false.

$ ccm create -v 3.0.14 -n 3 cluster_3.0.14
$ for h in $(seq 1 3) ; do perl -pi -e 's/hinted_handoff_enabled: true/hinted_handoff_enabled: false/' ~/.ccm/cluster_3.0.14/node$h/conf/cassandra.yaml ; done
$ for h in $(seq 1 3) ; do grep "hinted_handoff_enabled:" ~/.ccm/cluster_3.0.14/node$h/conf/cassandra.yaml ; done
hinted_handoff_enabled: false
hinted_handoff_enabled: false
hinted_handoff_enabled: false
$ ccm start
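
The grep check above can also be wrapped in a helper that fails as soon as one node still has hints enabled, which may be convenient when scripting the reproduction. This is only a sketch; the function name `hints_disabled_everywhere` is hypothetical.

```shell
# Hypothetical helper: succeed only if every given cassandra.yaml contains
# a line starting with "hinted_handoff_enabled: false".
hints_disabled_everywhere() {
  for conf in "$@"; do
    grep -q '^hinted_handoff_enabled: false' "$conf" || return 1
  done
}
```

For example: `hints_disabled_everywhere ~/.ccm/cluster_3.0.14/node*/conf/cassandra.yaml && ccm start`.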

2. Create a keyspace and a table.

$ ccm node1 cqlsh
DROP KEYSPACE IF EXISTS ks1;
CREATE KEYSPACE ks1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true;
CREATE TABLE ks1.t1 (
        key text PRIMARY KEY,
        value blob
    ) WITH bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
        AND comment = ''
        AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
        AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = 'ALWAYS';
QUIT;

3. Stop node2 and node3. Insert a row.

$ ccm node3 stop && ccm node2 stop && ccm status
Cluster: 'cluster_3.0.14'
----------------------
node1: UP
node3: DOWN
node2: DOWN

$ ccm node1 cqlsh -k ks1 -e "consistency; tracing on; insert into ks1.t1 (key, value) values ('mmullass', bigintAsBlob(1));"
Current consistency level is ONE.
Now Tracing is enabled

Tracing session: 01d74590-97cb-11e7-8ea7-c1bd4d549501

 activity                                                                                            | timestamp                  | source    | source_elapsed
-----------------------------------------------------------------------------------------------------+----------------------------+-----------+----------------
                                                                                  Execute CQL3 query | 2017-09-12 23:59:42.316000 | 127.0.0.1 |              0
 Parsing insert into ks1.t1 (key, value) values ('mmullass', bigintAsBlob(1)); [SharedPool-Worker-1] | 2017-09-12 23:59:42.319000 | 127.0.0.1 |           4323
                                                           Preparing statement [SharedPool-Worker-1] | 2017-09-12 23:59:42.320000 | 127.0.0.1 |           5250
                                             Determining replicas for mutation [SharedPool-Worker-1] | 2017-09-12 23:59:42.327000 | 127.0.0.1 |          11886
                                                        Appending to commitlog [SharedPool-Worker-3] | 2017-09-12 23:59:42.327000 | 127.0.0.1 |          12195
                                                         Adding to t1 memtable [SharedPool-Worker-3] | 2017-09-12 23:59:42.327000 | 127.0.0.1 |          12392
                                                                                    Request complete | 2017-09-12 23:59:42.328680 | 127.0.0.1 |          12680


$ ccm node1 cqlsh -k ks1 -e "consistency; tracing on; select * from ks1.t1 where key = 'mmullass';"
Current consistency level is ONE.
Now Tracing is enabled

 key      | value
----------+--------------------
 mmullass | 0x0000000000000001

(1 rows)

Tracing session: 3420ce90-97cb-11e7-8ea7-c1bd4d549501

 activity                                                                   | timestamp                  | source    | source_elapsed
----------------------------------------------------------------------------+----------------------------+-----------+----------------
                                                         Execute CQL3 query | 2017-09-13 00:01:06.681000 | 127.0.0.1 |              0
 Parsing select * from ks1.t1 where key = 'mmullass'; [SharedPool-Worker-1] | 2017-09-13 00:01:06.681000 | 127.0.0.1 |            296
                                  Preparing statement [SharedPool-Worker-1] | 2017-09-13 00:01:06.681000 | 127.0.0.1 |            561
               Executing single-partition query on t1 [SharedPool-Worker-2] | 2017-09-13 00:01:06.682000 | 127.0.0.1 |           1056
                         Acquiring sstable references [SharedPool-Worker-2] | 2017-09-13 00:01:06.682000 | 127.0.0.1 |           1142
                            Merging memtable contents [SharedPool-Worker-2] | 2017-09-13 00:01:06.682000 | 127.0.0.1 |           1206
                    Read 1 live and 0 tombstone cells [SharedPool-Worker-2] | 2017-09-13 00:01:06.682000 | 127.0.0.1 |           1455
                                                           Request complete | 2017-09-13 00:01:06.682794 | 127.0.0.1 |           1794

4. Start node2 and confirm node2 has no data.

$ ccm node2 start && ccm status
Cluster: 'cluster_3.0.14'
-------------------------
node1: UP
node3: DOWN
node2: UP

$ ccm node2 nodetool flush
$ ls ~/.ccm/cluster_3.0.14/node2/data0/ks1/t1-*/*-Data.db
ls: /Users/hiwakaba/.ccm/cluster_3.0.14/node2/data0/ks1/t1-*/*-Data.db: No such file or directory
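
Because `ls` reports an error when the glob matches nothing, a scripted version of this check may prefer a count. The following is a minimal sketch using `find`; the function name `count_data_files` is hypothetical, and it counts every `*-Data.db` file below the given directory, including any in snapshot subdirectories.

```shell
# Hypothetical helper: print how many SSTable data files (*-Data.db) exist
# anywhere below the given directory. Prints 0 when there are none or when
# the directory does not exist.
count_data_files() {
  find "$1" -type f -name '*-Data.db' 2>/dev/null | wc -l | tr -d ' '
}
```

For example, `count_data_files ~/.ccm/cluster_3.0.14/node2/data0/ks1` would be expected to print 0 at this point in the procedure.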

5. Select the row from node2. Read repair is triggered.

$ ccm node2 cqlsh -k ks1 -e "consistency; tracing on; select * from ks1.t1 where key = 'mmullass';"
Current consistency level is ONE.
Now Tracing is enabled

 key | value
-----+-------

(0 rows)

Tracing session: 72a71fc0-97cb-11e7-83cc-a3af9d3da979

 activity                                                                                                                                                                                                                                | timestamp                  | source    | source_elapsed
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-----------+----------------
                                                                                                                                                                                                                      Execute CQL3 query | 2017-09-13 00:02:51.582000 | 127.0.0.2 |              0
                                                                                                                                                              Parsing select * from ks1.t1 where key = 'mmullass'; [SharedPool-Worker-2] | 2017-09-13 00:02:51.583000 | 127.0.0.2 |           1112
                                                                                                                                                                                               Preparing statement [SharedPool-Worker-2] | 2017-09-13 00:02:51.583000 | 127.0.0.2 |           1412
                                                                                                                                                                                      reading data from /127.0.0.1 [SharedPool-Worker-2] | 2017-09-13 00:02:51.584000 | 127.0.0.2 |           2107
                                                                                                                                                                            Executing single-partition query on t1 [SharedPool-Worker-1] | 2017-09-13 00:02:51.585000 | 127.0.0.2 |           3492
                                                                                                                                                               Sending READ message to /127.0.0.1 [MessagingService-Outgoing-/127.0.0.1] | 2017-09-13 00:02:51.585000 | 127.0.0.2 |           3516
                                                                                                                                                                                      Acquiring sstable references [SharedPool-Worker-1] | 2017-09-13 00:02:51.585000 | 127.0.0.2 |           3595
                                                                                                                                                                                         Merging memtable contents [SharedPool-Worker-1] | 2017-09-13 00:02:51.585001 | 127.0.0.2 |           3673
                                                                                                                                                                                 Read 0 live and 0 tombstone cells [SharedPool-Worker-1] | 2017-09-13 00:02:51.585001 | 127.0.0.2 |           3851
                                                                                                                                                            READ message received from /127.0.0.2 [MessagingService-Incoming-/127.0.0.2] | 2017-09-13 00:02:51.588000 | 127.0.0.1 |             33
                                                                                                                                                                                      Acquiring sstable references [SharedPool-Worker-2] | 2017-09-13 00:02:51.600000 | 127.0.0.1 |          12444
                                                                                                                                                                                         Merging memtable contents [SharedPool-Worker-2] | 2017-09-13 00:02:51.600000 | 127.0.0.1 |          12536
                                                                                                                                                                                 Read 1 live and 0 tombstone cells [SharedPool-Worker-2] | 2017-09-13 00:02:51.600000 | 127.0.0.1 |          12765
                                                                                                                                                                                  Enqueuing response to /127.0.0.2 [SharedPool-Worker-2] | 2017-09-13 00:02:51.600000 | 127.0.0.1 |          12929
                                                                                                                                                   Sending REQUEST_RESPONSE message to /127.0.0.2 [MessagingService-Outgoing-/127.0.0.2] | 2017-09-13 00:02:51.602000 | 127.0.0.1 |          14686
                                                                                                                                                REQUEST_RESPONSE message received from /127.0.0.1 [MessagingService-Incoming-/127.0.0.1] | 2017-09-13 00:02:51.603000 | 127.0.0.2 |             --
                                                                                                                                                                               Processing response from /127.0.0.1 [SharedPool-Worker-3] | 2017-09-13 00:02:51.610000 | 127.0.0.2 |             --
                                                                                                                                                                                            Initiating read-repair [SharedPool-Worker-3] | 2017-09-13 00:02:51.610000 | 127.0.0.2 |             --
 Digest mismatch: org.apache.cassandra.service.DigestMismatchException: Mismatch for key DecoratedKey(-4886857781295767937, 6d6d756c6c617373) (d41d8cd98f00b204e9800998ecf8427e vs f8e0f9262a889cd3ebf4e5d50159757b) [ReadRepairStage:1] | 2017-09-13 00:02:51.624000 | 127.0.0.2 |             --
                                                                                                                                                                                                                        Request complete | 2017-09-13 00:02:51.586892 | 127.0.0.2 |           4892
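
The relevant event in the trace above is "Initiating read-repair". To check for it without reading the whole table, the cqlsh output can be piped through a small filter. This is a sketch only; the function name `count_read_repairs` is hypothetical, and it relies on the trace wording shown in this report.

```shell
# Hypothetical helper: read cqlsh tracing output on stdin and print the
# number of lines that contain "Initiating read-repair".
# grep -c exits non-zero when the count is 0, so "|| true" keeps the
# helper usable in scripts that run with "set -e".
count_read_repairs() {
  grep -c 'Initiating read-repair' || true
}
```

For example: `ccm node2 cqlsh -k ks1 -e "tracing on; select * from ks1.t1 where key = 'mmullass';" | count_read_repairs`. With read repair fully disabled, the expectation argued for in this issue is a count of 0.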

6. As a result, node2 has the row.

$ ccm node2 cqlsh -k ks1 -e "consistency; tracing on; select * from ks1.t1 where key = 'mmullass';"
Current consistency level is ONE.
Now Tracing is enabled

 key      | value
----------+--------------------
 mmullass | 0x0000000000000001

(1 rows)

Tracing session: 78526330-97cb-11e7-83cc-a3af9d3da979

 activity                                                                                 | timestamp                  | source    | source_elapsed
------------------------------------------------------------------------------------------+----------------------------+-----------+----------------
                                                                       Execute CQL3 query | 2017-09-13 00:03:01.091000 | 127.0.0.2 |              0
               Parsing select * from ks1.t1 where key = 'mmullass'; [SharedPool-Worker-3] | 2017-09-13 00:03:01.091000 | 127.0.0.2 |            216
                                                Preparing statement [SharedPool-Worker-3] | 2017-09-13 00:03:01.091000 | 127.0.0.2 |            390
                                       reading data from /127.0.0.1 [SharedPool-Worker-3] | 2017-09-13 00:03:01.091000 | 127.0.0.2 |            808
                             Executing single-partition query on t1 [SharedPool-Worker-2] | 2017-09-13 00:03:01.092000 | 127.0.0.2 |           1041
             READ message received from /127.0.0.2 [MessagingService-Incoming-/127.0.0.2] | 2017-09-13 00:03:01.092000 | 127.0.0.1 |             33
                Sending READ message to /127.0.0.1 [MessagingService-Outgoing-/127.0.0.1] | 2017-09-13 00:03:01.092000 | 127.0.0.2 |           1036
                             Executing single-partition query on t1 [SharedPool-Worker-1] | 2017-09-13 00:03:01.092000 | 127.0.0.1 |            189
                                       Acquiring sstable references [SharedPool-Worker-2] | 2017-09-13 00:03:01.092000 | 127.0.0.2 |           1113
                                       Acquiring sstable references [SharedPool-Worker-1] | 2017-09-13 00:03:01.092000 | 127.0.0.1 |            276
                                          Merging memtable contents [SharedPool-Worker-2] | 2017-09-13 00:03:01.092000 | 127.0.0.2 |           1172
                                          Merging memtable contents [SharedPool-Worker-1] | 2017-09-13 00:03:01.092000 | 127.0.0.1 |            332
 REQUEST_RESPONSE message received from /127.0.0.1 [MessagingService-Incoming-/127.0.0.1] | 2017-09-13 00:03:01.093000 | 127.0.0.2 |             --
                                  Read 1 live and 0 tombstone cells [SharedPool-Worker-1] | 2017-09-13 00:03:01.093000 | 127.0.0.1 |            565
                                   Enqueuing response to /127.0.0.2 [SharedPool-Worker-1] | 2017-09-13 00:03:01.093000 | 127.0.0.1 |            648
    Sending REQUEST_RESPONSE message to /127.0.0.2 [MessagingService-Outgoing-/127.0.0.2] | 2017-09-13 00:03:01.093000 | 127.0.0.1 |            783
                                Processing response from /127.0.0.1 [SharedPool-Worker-1] | 2017-09-13 00:03:01.094000 | 127.0.0.2 |             --
                                             Initiating read-repair [SharedPool-Worker-1] | 2017-09-13 00:03:01.099000 | 127.0.0.2 |             --
                                  Read 1 live and 0 tombstone cells [SharedPool-Worker-2] | 2017-09-13 00:03:01.101000 | 127.0.0.2 |          10113
                                                                         Request complete | 2017-09-13 00:03:01.092830 | 127.0.0.2 |           1830

$ ccm node2 nodetool flush
$ ls ~/.ccm/cluster_3.0.14/node2/data0/ks1/t1-*/*-Data.db
/Users/hiwakaba/.ccm/cluster_3.0.14/node2/data0/ks1/t1-ec659e0097ca11e78ea7c1bd4d549501/mc-1-big-Data.db

$ ~/.ccm/repository/3.0.14/tools/bin/sstabledump /Users/hiwakaba/.ccm/cluster_3.0.14/node2/data0/ks1/t1-ec659e0097ca11e78ea7c1bd4d549501/mc-1-big-Data.db -k mmullass
[
  {
    "partition" : {
      "key" : [ "mmullass" ],
      "position" : 0
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 36,
        "liveness_info" : { "tstamp" : "2017-09-12T14:59:42.312969Z" },
        "cells" : [
          { "name" : "value", "value" : "0000000000000001" }
        ]
      }
    ]
  }
]

In CASSANDRA-11409, Cameron Zemek commented that this behavior is not a bug, so I filed this issue as an improvement.

Attachments

speculative retries.pdf | 06/Oct/17 05:32 | 830 kB | Shogo Hoshii
0001-Use-read_repair_chance-when-starting-repairs-due-to-.patch | 11/Oct/17 05:15 | 2 kB | Murukesh Mohanan

Issue Links

is related to: CASSANDRA-10726 Read repair inserts should not be blocking (Resolved)
