Cassandra / CASSANDRA-11654

sstabledump is not able to properly print out SSTable that may contain historical (but "shadowed") row tombstone


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Normal
    • Resolution: Fixed
    • Fix Version/s: 3.0.6, 3.6
    • Component/s: Legacy/Tools
    • Severity: Normal

    Description

      This is trivial to reproduce. Here are the steps I used (on a single-node C* 3.x cluster):

      echo "CREATE KEYSPACE IF NOT EXISTS testks WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'};" | cqlsh
      echo "CREATE TABLE IF NOT EXISTS testks.testcf ( k int, c text, val0_int int, PRIMARY KEY (k, c) );" | cqlsh
      echo "INSERT INTO testks.testcf (k, c, val0_int) VALUES (1, 'c1', 100);" | cqlsh
      echo "DELETE FROM testks.testcf where k=1 and c='c1';" | cqlsh
      echo "INSERT INTO testks.testcf (k, c, val0_int) VALUES (1, 'c1', 100);" | cqlsh
      nodetool flush testks testcf
      echo "SELECT * FROM testks.testcf;" | cqlsh
      

      The last step confirms that there is one live row in the testks.testcf table. However, if you now go to the SSTable data directory and run sstabledump as follows, the row is still shown as deleted and no row content is printed:

      $ sstabledump ma-1-big-Data.db
      [
        {
          "partition" : {
            "key" : [ "1" ],
            "position" : 0
          },
          "rows" : [
            {
              "type" : "row",
              "position" : 18,
              "clustering" : [ "c1" ],
              "liveness_info" : { "tstamp" : 1461633248542342 },
              "deletion_info" : { "deletion_time" : 1461633248212499, "tstamp" : 1461633248 }
            }
          ]
        }
      ]
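
To make the "shadowed" relationship concrete, the sketch below compares the two microsecond-scale timestamps from the dump above. It assumes that the value printed as deletion_time is the row tombstone's write timestamp, and that a row is live when its liveness timestamp is strictly newer than the tombstone. row_state is a hypothetical helper for illustration, not part of sstabledump or Cassandra.

```shell
#!/bin/sh
# Values copied from the sstabledump output in this report.
liveness_tstamp=1461633248542342    # liveness_info.tstamp (the second INSERT)
tombstone_tstamp=1461633248212499   # deletion_info.deletion_time as printed

# Hypothetical helper (assumption): a row counts as live when its liveness
# timestamp is strictly newer than the row tombstone's timestamp.
row_state() {
  if [ "$1" -gt "$2" ]; then
    echo "live"
  else
    echo "deleted"
  fi
}

row_state "$liveness_tstamp" "$tombstone_tstamp"                  # prints: live
echo "gap_us=$(( liveness_tstamp - tombstone_tstamp ))"           # prints: gap_us=329843
```

Under these assumptions the re-insert is about 0.33 seconds newer than the tombstone, which matches the live row returned by the SELECT.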
      

      This reproduces on both the latest 3.0.5 and 3.6-snapshot (i.e. trunk as of Apr 25, 2016).

      It looks like only row tombstones affect sstabledump. If you generate cell tombstones instead, even by deleting every non-PK, non-static column in the row, sstabledump works fine as long as there is no explicit row delete (so the clustering is still considered alive). See the following example steps:

      echo "CREATE KEYSPACE IF NOT EXISTS testks WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'};" | cqlsh
      echo "CREATE TABLE IF NOT EXISTS testks.testcf ( k int, c text, val0_int int, val1_int int, PRIMARY KEY (k, c) );" | cqlsh
      echo "INSERT INTO testks.testcf (k, c, val0_int, val1_int) VALUES (1, 'c1', 100, 200);" | cqlsh
      echo "DELETE val0_int, val1_int FROM testks.testcf where k=1 and c='c1';" | cqlsh
      echo "INSERT INTO testks.testcf (k, c, val0_int, val1_int) VALUES (1, 'c1', 300, 400);" | cqlsh
      nodetool flush testks testcf
      echo "select * from testks.testcf;" | cqlsh
      
      $ sstabledump ma-1-big-Data.db
      [
        {
          "partition" : {
            "key" : [ "1" ],
            "position" : 0
          },
          "rows" : [
            {
              "type" : "row",
              "position" : 18,
              "clustering" : [ "c1" ],
              "liveness_info" : { "tstamp" : 1461634633566479 },
              "cells" : [
                { "name" : "val0_int", "value" : "300" },
                { "name" : "val1_int", "value" : "400" }
              ]
            }
          ]
        }
      ]
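
The difference between the two dumps can be checked mechanically. The sketch below is a plain-text heuristic over the whole dump, not a per-row JSON parse, and classify_dump is a hypothetical helper introduced here for illustration. The sample rows are abbreviated copies of the two outputs shown in this report.

```shell
#!/bin/sh
# Hypothetical helper: reads sstabledump JSON on stdin and reports which of
# the two shapes from this report it matches. Text match only (assumption:
# the field names "cells" and "deletion_info" appear as printed above).
classify_dump() {
  dump=$(cat)
  case "$dump" in
    *'"cells"'*)         echo "cells printed" ;;
    *'"deletion_info"'*) echo "row tombstone only" ;;
    *)                   echo "neither" ;;
  esac
}

# Abbreviated copies of the two rows shown in this report.
row_delete_case='{ "liveness_info" : { "tstamp" : 1461633248542342 }, "deletion_info" : { "deletion_time" : 1461633248212499, "tstamp" : 1461633248 } }'
cell_delete_case='{ "liveness_info" : { "tstamp" : 1461634633566479 }, "cells" : [ { "name" : "val0_int", "value" : "300" }, { "name" : "val1_int", "value" : "400" } ] }'

printf '%s\n' "$row_delete_case"  | classify_dump   # prints: row tombstone only
printf '%s\n' "$cell_delete_case" | classify_dump   # prints: cells printed
```

Against a real file the same helper could be fed directly, for example `sstabledump ma-1-big-Data.db | classify_dump`.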
      

          People

            yukim Yuki Morishita
            weideng Wei Deng
            Yuki Morishita
            Chris Lohfink