Cassandra / CASSANDRA-13403

nodetool repair breaks SASI index

    Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Normal
    • Resolution: Unresolved
    • Fix Version/s: None
    • Component/s: Feature/SASI
    • Labels:
    • Environment: 3.10
    • Severity: Normal

      Description

      I have the following table and SASI index:

      CREATE TABLE cservice.bulks_recipients (
          recipient text,
          bulk_id uuid,
          datetime_final timestamp,
          datetime_sent timestamp,
          request_id uuid,
          status int,
          PRIMARY KEY (recipient, bulk_id)
      ) WITH CLUSTERING ORDER BY (bulk_id ASC)
          AND bloom_filter_fp_chance = 0.01
          AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
          AND comment = ''
          AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
          AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
          AND crc_check_chance = 1.0
          AND dclocal_read_repair_chance = 0.1
          AND default_time_to_live = 0
          AND gc_grace_seconds = 864000
          AND max_index_interval = 2048
          AND memtable_flush_period_in_ms = 0
          AND min_index_interval = 128
          AND read_repair_chance = 0.0
          AND speculative_retry = '99PERCENTILE';
      CREATE CUSTOM INDEX bulk_recipients_bulk_id ON cservice.bulks_recipients (bulk_id) USING 'org.apache.cassandra.index.sasi.SASIIndex';
      

      There are 11 rows in it:

      > select * from bulks_recipients;
      
      ...
      (11 rows)
      

      Let's query by index (all rows have the same bulk_id):

      > select * from bulks_recipients where bulk_id = baa94815-e276-4ca4-adda-5b9734e6c4a5;                                                 
      
      ...
      (11 rows)
      

      Ok, everything is fine.

      Then I ran nodetool repair --partitioner-range --job-threads 4 --full on each node in the cluster, one after another.

      After it finished:

      > select * from bulks_recipients where bulk_id = baa94815-e276-4ca4-adda-5b9734e6c4a5;
      
      ...
      (2 rows)
      

      Only two rows.

      While the rows are actually there:

      > select * from bulks_recipients;
      
      ...
      (11 rows)
      

      If I run an incremental repair on a random node, the index query can return a different count, for example 7 rows.

      Dropping the index and recreating it fixes the issue. Is this a bug, or am I running the repair the wrong way?
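
      For reference, the drop-and-recreate workaround would look roughly like the sketch below. It reuses the index definition from the schema above and assumes the index lives in the cservice keyspace under the same name. It only rebuilds the index contents and says nothing about why the repair leaves the index incomplete.

      ```cql
      -- Drop the SASI index that returns incomplete results after repair.
      DROP INDEX cservice.bulk_recipients_bulk_id;

      -- Recreate it with the original definition; the index is rebuilt from the existing SSTables.
      CREATE CUSTOM INDEX bulk_recipients_bulk_id
          ON cservice.bulks_recipients (bulk_id)
          USING 'org.apache.cassandra.index.sasi.SASIIndex';
      ```

      Once the rebuild completes, the bulk_id query from the description should be rerun to confirm it again matches the full table scan.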

        Attachments

        1. 3_nodes_compaction.log
          3 kB
          Ludovic Boutros
        2. 4_nodes_compaction.log
          55 kB
          Ludovic Boutros
        3. testSASIRepair.patch
          5 kB
          Ludovic Boutros
        4. CASSANDRA-13403.patch
          9 kB
          Ludovic Boutros

            People

            • Assignee: Unassigned
            • Reporter: blind_oracle Igor Novgorodov
            • Votes: 2
            • Watchers: 14
