Apache Cassandra / CASSANDRA-11068

Entire row is compacted away if remaining cells are tombstones expiring after gc_grace_seconds


Details

    • Bug
    • Status: Resolved
    • Normal
    • Resolution: Duplicate
    • Normal

    Description

      Assuming the following schema:

      CREATE TABLE simple.data (
          k text PRIMARY KEY,
          v int
      ) WITH gc_grace_seconds = 300;
      

      And the following queries:

      insert into simple.data (k, v) values ('blah', 1);
      delete v from simple.data where k='blah';
      

      Performing a select * from this table will return 1 row with a null value:

      cqlsh> select * from simple.data;
      
               k | v
      -----------+---------
            blah |    null
      

      Prior to 3.0, if I were to do a flush, the sstable representation of this table would include an empty cell and a tombstone:

      [
      {"key": "blah",
       "cells": [["","",1453747038457027],
                 ["v",1453747112,1453747112383096,"d"]]}
      ]
      

      As my gc_grace_seconds value is 300, if I wait 5 minutes and perform a compaction, the new sstable would omit the tombstone, but the empty cell would still be present:

      [
      {"key": "blah",
       "cells": [["","",1453747038457027]]}
      ]
      

      Performing the select * query would still yield the same result because of this.
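      The pre-3.0 behavior described above can be pictured with a small simulation. This is only a sketch under stated assumptions, not Cassandra's actual compaction code: the `compact` and `is_tombstone` helpers are hypothetical, and it assumes that in the JSON dump a fourth element `"d"` marks a deleted cell whose second element is the deletion time in seconds.

```python
import json

GC_GRACE_SECONDS = 300  # gc_grace_seconds from the CREATE TABLE above

# Pre-3.0 JSON dump copied from the report.
SSTABLE_JSON = """
[
{"key": "blah",
 "cells": [["","",1453747038457027],
           ["v",1453747112,1453747112383096,"d"]]}
]
"""

def is_tombstone(cell):
    # Assumption: a 4th element "d" marks a deleted cell, and the 2nd
    # element is then the local deletion time in seconds.
    return len(cell) == 4 and cell[3] == "d"

def compact(rows, now, gc_grace=GC_GRACE_SECONDS):
    """Hypothetical model: drop tombstones deleted before now - gc_grace."""
    gc_before = now - gc_grace
    result = []
    for row in rows:
        kept = [c for c in row["cells"]
                if not (is_tombstone(c) and c[1] < gc_before)]
        if kept:  # a row with no cells left is not written at all
            result.append({"key": row["key"], "cells": kept})
    return result

rows = json.loads(SSTABLE_JSON)
deleted_at = 1453747112
early = compact(rows, now=deleted_at + 60)   # still inside the grace period
late = compact(rows, now=deleted_at + 301)   # grace period has elapsed
print(late)
```

      In this model the tombstone survives a compaction that runs inside the grace period, and only the empty cell remains once the 300 seconds have passed, which is why the row is still returned.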

      However, in 3.2.1 this does not seem to be the behavior. After deleting the 'v' cell, performing a flush, waiting 5 minutes, and running a compaction, the sstable disappears completely (presumably because there is no remaining data) and the select query returns 0 rows:

      cqlsh> select * from simple.data;
      
               k | v
      -----------+---------
      
      (0 rows)
      

      I'm unsure if this is by design or a bug, but it does represent a change between C* versions.
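      To make the difference between the two outcomes concrete, here is a toy model of the reporter's presumption that the row vanishes because no data remains. It is a sketch only: the dictionary layout, `purge`, `compact`, and `select_all` are hypothetical and do not reflect the 3.x storage engine or the confirmed cause of the change.

```python
def purge(cells, now, gc_grace):
    """Keep live cells and tombstones that are still within gc_grace."""
    gc_before = now - gc_grace
    return [c for c in cells
            if c["deleted_at"] is None or c["deleted_at"] >= gc_before]

def compact(partitions, now, gc_grace=300):
    """Hypothetical compaction: a partition with no cells left is dropped."""
    out = {}
    for key, cells in partitions.items():
        kept = purge(cells, now, gc_grace)
        if kept:
            out[key] = kept
    return out

def select_all(partitions):
    """Return (k, v) per surviving partition; v is None without a live value."""
    rows = []
    for key, cells in partitions.items():
        live = [c["value"] for c in cells
                if c["name"] == "v" and c["deleted_at"] is None]
        rows.append((key, live[0] if live else None))
    return rows

# The empty-named cell and the tombstone mirror the pre-3.0 dump in the report.
MARKER = {"name": "", "value": "", "deleted_at": None}
V_TOMBSTONE = {"name": "v", "value": None, "deleted_at": 1453747112}
now = 1453747112 + 301  # just past gc_grace_seconds = 300

with_marker = compact({"blah": [MARKER, V_TOMBSTONE]}, now)
without_marker = compact({"blah": [V_TOMBSTONE]}, now)

print(select_all(with_marker))     # one row with a null v
print(select_all(without_marker))  # no rows
```

      When something other than the tombstone survives compaction, the query still returns one row with a null value; when the tombstone was the only remaining data, nothing is left to return.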

      I have not tried this for a table with clustering columns yet; my initial assumption was that the behavior would be the same. (Note: the problem only manifests for tables with no clustering columns.)


          People

            Sylvain Lebresne (slebresne)
            Andy Tolbert (andrew.tolbert)
            Sylvain Lebresne
            Votes: 0
            Watchers: 7
