CASSANDRA-16671

Cassandra can return no row when the row columns have been deleted.


Details

    Description

      Under CQL semantics, a (CQL) row exists as long as it has at least one non-null column (including the PK columns).

      To determine whether a row has a non-null primary key, Cassandra relies on the row's primary key liveness.

      For example:

      CREATE TABLE test (pk int, ck int, v int, PRIMARY KEY(pk, ck));
      INSERT INTO test(pk, ck, v) VALUES (1, 1, 1);
      DELETE v FROM test WHERE pk = 1 AND ck = 1;
      SELECT v FROM test;
      

      will return

      v
      ---
      null 
      

      UPDATE statements do not set the row's primary key liveness. Consequently, if the user had used an UPDATE statement instead of an INSERT, the SELECT query would not have returned any rows.
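
      The difference between INSERT and UPDATE can be sketched with a toy model. This is only an illustration of the rule described above, not Cassandra's internal representation: the class ToyRow and all of its methods are hypothetical names introduced for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of a CQL row; not Cassandra's internal Row class.
final class ToyRow {
    // Set by INSERT only: marks the primary key itself as live.
    private boolean primaryKeyLive = false;
    // Regular columns; a null value stands for a deleted column.
    private final Map<String, Integer> columns = new HashMap<>();

    // INSERT writes the column and sets the primary key liveness.
    void insert(String column, int value) {
        primaryKeyLive = true;
        columns.put(column, value);
    }

    // UPDATE writes the column but leaves the primary key liveness untouched.
    void update(String column, int value) {
        columns.put(column, value);
    }

    // DELETE <column> replaces the value with a deletion marker.
    void deleteColumn(String column) {
        columns.put(column, null);
    }

    // The row exists if its primary key is live or any regular column is non-null.
    boolean exists() {
        return primaryKeyLive || columns.values().stream().anyMatch(v -> v != null);
    }
}
```

      In this model, insert("v", 1) followed by deleteColumn("v") leaves a row that still exists (with v = null), whereas update("v", 1) followed by deleteColumn("v") leaves no row at all.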

      CASSANDRA-16226 introduced a regression by stopping early in the timestamp-ordered logic when an UPDATE statement covering all the columns is found in an SSTable. Because the row returned in that case has no primary key liveness, the expected row is not returned if another node also returns a column deletion.
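
      The effect of the early stop can be sketched with a small last-write-wins model. It is a deliberate simplification, not the actual read path: ToyFragment, its fields, and merge are hypothetical names, and the timestamps 1000, 2000 and 3000 are chosen to mirror an INSERT, a later UPDATE, and a still later column deletion.

```java
// Hypothetical, simplified stand-in for the data one SSTable or replica holds for a row.
final class ToyFragment {
    static final long NONE = Long.MIN_VALUE;

    final long pkLivenessTs; // timestamp of the primary key liveness, NONE if absent
    final long valueTs;      // timestamp of the newest write to column v
    final Integer value;     // value of v; null means the column was deleted

    ToyFragment(long pkLivenessTs, long valueTs, Integer value) {
        this.pkLivenessTs = pkLivenessTs;
        this.valueTs = valueTs;
        this.value = value;
    }

    // Last-write-wins merge: keep the newest liveness and the newest write to v.
    static ToyFragment merge(ToyFragment a, ToyFragment b) {
        ToyFragment newest = a.valueTs >= b.valueTs ? a : b;
        return new ToyFragment(Math.max(a.pkLivenessTs, b.pkLivenessTs),
                               newest.valueTs, newest.value);
    }

    // The row exists if it has primary key liveness or a non-null column.
    boolean rowExists() {
        return pkLivenessTs != NONE || value != null;
    }

    public static void main(String[] args) {
        ToyFragment insert = new ToyFragment(1000, 1000, 1);    // node 1, older SSTable
        ToyFragment update = new ToyFragment(NONE, 2000, 2);    // node 1, newer SSTable
        ToyFragment delete = new ToyFragment(NONE, 3000, null); // node 2

        // Reading every SSTable on node 1 keeps the liveness set by the INSERT.
        System.out.println(merge(merge(insert, update), delete).rowExists());
        // Stopping at the UPDATE drops that liveness, so the merged row vanishes.
        System.out.println(merge(update, delete).rowExists());
    }
}
```

      In this model the full merge keeps the row (with v deleted), while the early-stopped merge has neither primary key liveness nor a live column, which matches the missing row described above.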

      The problem can be reproduced with the following test:

          @Test
          public void testSelectWithUpdatedColumnOnOneNodeAndColumnDeletionOnTheOther() throws Throwable
          {
              try (Cluster cluster = init(builder().withNodes(2).start()))
              {
                  cluster.schemaChange(withKeyspace("CREATE TABLE %s.tbl (pk int, ck text, v int, PRIMARY KEY (pk, ck))"));
                  // Node 1: the INSERT (which sets the primary key liveness) and the later UPDATE land in separate SSTables.
                  cluster.get(1).executeInternal(withKeyspace("INSERT INTO %s.tbl (pk, ck, v) VALUES (1, '1', 1) USING TIMESTAMP 1000"));
                  cluster.get(1).flush(KEYSPACE);
                  cluster.get(1).executeInternal(withKeyspace("UPDATE %s.tbl USING TIMESTAMP 2000 SET v = 2 WHERE pk = 1 AND ck = '1'"));
                  cluster.get(1).flush(KEYSPACE);
      
                  // Node 2: only the newer column deletion.
                  cluster.get(2).executeInternal(withKeyspace("DELETE v FROM %s.tbl USING TIMESTAMP 3000 WHERE pk=1 AND ck='1'"));
                  cluster.get(2).flush(KEYSPACE);
      
                  // The row must still be returned with v = null, because node 1 holds its primary key liveness.
                  assertRows(cluster.coordinator(2).execute(withKeyspace("SELECT * FROM %s.tbl WHERE pk=1 AND ck='1'"), ConsistencyLevel.ALL),
                             row(1, "1", null)); // <-- FAIL
                  assertRows(cluster.coordinator(2).execute(withKeyspace("SELECT v FROM %s.tbl WHERE pk=1 AND ck='1'"), ConsistencyLevel.ALL),
                             row((Integer) null));
              }
          }
      

      cc: Caleb Rackliffe, Alex Petrov


          People

            Benjamin Lerer
            Caleb Rackliffe
            Votes: 0
            Watchers: 7

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 2h