  1. Apache Cassandra
  2. CASSANDRA-6517

Loss of secondary index entries if nodetool cleanup called before compaction


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Normal
    • Resolution: Fixed
    • Fix Version/s: 2.0.5
    • None
    • Environment: Ubuntu 12.0.4 with 8+ GB RAM and 40GB hard disk for data directory.

    • Normal

    Description

      From time to time we suspected that queries using secondary indexes were not returning all the results they should. We have now tracked down some cases and found that it happened:

      1) to primary keys that had been deleted and were re-created later

      2) after our nightly maintenance scripts had run

      We can now reproduce the following scenario:

      • create a row entry with an indexed column included
      • query it and use the secondary index criteria -> Success
      • delete it, query again -> entry gone as expected
      • re-create it with the same key, query it -> success again

      Now run the following commands in exactly this order:

      nodetool cleanup
      nodetool flush
      nodetool compact

      When we issue the query now, we no longer get the result via the index. The entry is still present in its table when I query by key alone. Below is the exact copy-paste output from CQL when I reproduced the problem with an example entry in one of our tables.

      mwerrch@mstc01401:/opt/cassandra$ current/bin/cqlsh
      Connected to 14-15-Cluster at localhost:9160.
      [cqlsh 4.1.0 | Cassandra 2.0.3 | CQL spec 3.1.1 | Thrift protocol 19.38.0]
      Use HELP for help.
      cqlsh> use mwerrch;
      cqlsh:mwerrch> desc tables;

      B4Container_Demo

      cqlsh:mwerrch> desc table "B4Container_Demo";

      CREATE TABLE "B4Container_Demo" (
      key uuid,
      archived boolean,
      bytes int,
      computer int,
      deleted boolean,
      description text,
      doarchive boolean,
      filename text,
      first boolean,
      frames int,
      ifversion int,
      imported boolean,
      jobid int,
      keepuntil bigint,
      nextchunk text,
      node int,
      recordingkey blob,
      recstart bigint,
      recstop bigint,
      simulationid bigint,
      systemstart bigint,
      systemstop bigint,
      tapelabel bigint,
      version blob,
      PRIMARY KEY (key)
      ) WITH COMPACT STORAGE AND
      bloom_filter_fp_chance=0.010000 AND
      caching='KEYS_ONLY' AND
      comment='demo' AND
      dclocal_read_repair_chance=0.000000 AND
      gc_grace_seconds=604800 AND
      index_interval=128 AND
      read_repair_chance=1.000000 AND
      replicate_on_write='true' AND
      populate_io_cache_on_flush='false' AND
      default_time_to_live=0 AND
      speculative_retry='NONE' AND
      memtable_flush_period_in_ms=0 AND
      compaction={'class': 'SizeTieredCompactionStrategy'} AND
      compression={'sstable_compression': 'LZ4Compressor'};

      CREATE INDEX mwerrch_Demo_computer ON "B4Container_Demo" (computer);

      CREATE INDEX mwerrch_Demo_node ON "B4Container_Demo" (node);

      CREATE INDEX mwerrch_Demo_recordingkey ON "B4Container_Demo" (recordingkey);

      cqlsh:mwerrch> INSERT INTO "B4Container_Demo" (key,computer,node) VALUES (78c70562-1f98-3971-9c28-2c3d8e09c10f, 50, 50);
      cqlsh:mwerrch> select key,node,computer from "B4Container_Demo" where computer=50;

      key | node | computer
      --------------------------------------------------
      78c70562-1f98-3971-9c28-2c3d8e09c10f | 50 | 50

      (1 rows)

      cqlsh:mwerrch> DELETE FROM "B4Container_Demo" WHERE key=78c70562-1f98-3971-9c28-2c3d8e09c10f;
      cqlsh:mwerrch> select key,node,computer from "B4Container_Demo" where computer=50;

      (0 rows)

      cqlsh:mwerrch> INSERT INTO "B4Container_Demo" (key,computer,node) VALUES (78c70562-1f98-3971-9c28-2c3d8e09c10f, 50, 50);
      cqlsh:mwerrch> select key,node,computer from "B4Container_Demo" where computer=50;

      key | node | computer
      --------------------------------------------------
      78c70562-1f98-3971-9c28-2c3d8e09c10f | 50 | 50

      (1 rows)

      **********************************
      Now we execute the following from the /opt/cassandra/current/bin directory (ideally from a different shell, so that this session can stay open):
      ./nodetool cleanup
      ./nodetool flush
      ./nodetool compact

      Going back to our CQL session the result will no longer be available if queried via the index:
      *********************************

      cqlsh:mwerrch> select key,node,computer from "B4Container_Demo" where computer=50;

      (0 rows)
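
      For anyone who wants to replay the session above without typing it by hand, the steps can be sketched as a small shell script. This is not the attached repro.sh, only a rough outline under some assumptions: the keyspace, table and indexes from the session already exist, statements are fed to cqlsh on standard input, and the CASSANDRA_BIN, KEYSPACE and DRY_RUN variables are hypothetical knobs introduced here for illustration. The final select by key is added to show the contrast described earlier (row present by key, missing via the index).

```shell
#!/bin/sh
# Sketch of the reproduction steps from this report; NOT the attached repro.sh.
# Assumes the keyspace, table and indexes shown in the session already exist.
set -eu

# Hypothetical knobs (not part of the original report):
CASSANDRA_BIN="${CASSANDRA_BIN:-/opt/cassandra/current/bin}"
KEYSPACE="${KEYSPACE:-mwerrch}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print each step, 0 = actually run it

KEY='78c70562-1f98-3971-9c28-2c3d8e09c10f'
INSERT="INSERT INTO \"B4Container_Demo\" (key,computer,node) VALUES ($KEY, 50, 50);"
DELETE="DELETE FROM \"B4Container_Demo\" WHERE key=$KEY;"
BY_INDEX='select key,node,computer from "B4Container_Demo" where computer=50;'
BY_KEY="select key,node,computer from \"B4Container_Demo\" where key=$KEY;"

# Run one CQL statement by piping it to cqlsh on standard input.
cql() {
    if [ "$DRY_RUN" = "1" ]; then
        printf 'cqlsh> %s\n' "$1"
    else
        printf 'USE %s;\n%s\n' "$KEYSPACE" "$1" | "$CASSANDRA_BIN/cqlsh"
    fi
}

# Run one nodetool subcommand (cleanup, flush or compact).
maintenance() {
    if [ "$DRY_RUN" = "1" ]; then
        printf 'nodetool %s\n' "$1"
    else
        "$CASSANDRA_BIN/nodetool" "$1"
    fi
}

repro() {
    cql "$INSERT"      # create the row with an indexed column
    cql "$BY_INDEX"    # reported: 1 row
    cql "$DELETE"      # delete it
    cql "$BY_INDEX"    # reported: 0 rows
    cql "$INSERT"      # re-create it with the same key
    cql "$BY_INDEX"    # reported: 1 row again
    # The order matters according to the report: cleanup, then flush, then compact.
    for step in cleanup flush compact; do
        maintenance "$step"
    done
    cql "$BY_INDEX"    # reported: 0 rows (the bug)
    cql "$BY_KEY"      # per the description, the row is still found by key
}

repro
```

      By default the script only prints the steps; setting DRY_RUN=0 would run them against a live node, which should only be done on a test cluster since cleanup and compact rewrite SSTables.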

      Attachments

        1. repro.sh
          0.6 kB
          Sam Tunnicliffe
        2. 0001-CASSANDRA-6517-Use-column-timestamp-to-check-for-del.patch
          5 kB
          Sam Tunnicliffe


          People

            samt Sam Tunnicliffe
            awerrch Christoph Werres
            Sam Tunnicliffe
            Sylvain Lebresne
            Votes: 1
            Watchers: 7
