  1. Cassandra
  2. CASSANDRA-1316

Read repair does not always work correctly


    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: 0.6.4
    • Component/s: None
    • Labels:


      Read repair does not always work. At the very least, it allows the CL.ALL contract to be violated. To reproduce, create a three-node cluster with RF=3 and use json2sstable to load one of the attached JSON files on each node. This creates a row with key 'test' that has 9 columns in total, but only 3 of them are on each machine. If you call get_count on this row in quick succession at CL.ALL, you will sometimes receive a count of 6 and sometimes 9. After the ReadRepairManager has sent the repairs, you will always get 9, which is the desired behavior.
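
      To make the arithmetic behind those counts concrete, here is a minimal Python sketch. It is a toy model, not Cassandra code: the column names c1..c9 and the helper resolved_count are hypothetical, and it assumes the three replicas hold disjoint sets of three columns each. It only shows that merging two of the three replica responses yields 6 columns while merging all three yields 9; whether an incomplete merge is what actually happens in the server is an assumption, not something established here.

```python
# Toy model of the reproduction: three replicas, RF=3, one row keyed 'test'.
# Column names c1..c9 are placeholders; the attached JSON files are not shown here.
replicas = {
    "node1": {"c1", "c2", "c3"},
    "node2": {"c4", "c5", "c6"},
    "node3": {"c7", "c8", "c9"},
}


def resolved_count(responding_nodes):
    """Count the columns visible after merging the given replicas' responses."""
    merged = set()
    for node in responding_nodes:
        merged |= replicas[node]
    return len(merged)


# A read that merges only two of the three responses sees a partial row.
count_two = resolved_count(["node1", "node2"])
# A read that merges all three responses sees the full row.
count_all = resolved_count(["node1", "node2", "node3"])
print(count_two, count_all)
```

      A read at CL.ALL is expected to behave like the second call every time, so any read that returns the smaller count is a contract violation.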

      I have another data set, obtained in the wild, that never fully repairs for some reason, but it is a bit too large to attach (about 600 columns per machine). I am still trying to figure out why RR is not working on this set, but I always get different results when reading at any CL, including ALL, no matter how long I wait or how many reads I do.


        1. 001_correct_responsecount_in_RRR.txt
          1 kB
          Brandon Williams
        2. 1316-RRM.txt
          7 kB
          Jonathan Ellis
        3. cassandra-1.json
          0.1 kB
          Brandon Williams
        4. cassandra-2.json
          0.1 kB
          Brandon Williams
        5. cassandra-3.json
          0.1 kB
          Brandon Williams
        6. RRR-v2.txt
          3 kB
          Jonathan Ellis



            • Assignee:
              brandon.williams Brandon Williams
            • Votes:
              0
            • Watchers:
              3


              • Created: