[CASSANDRA-14699] Querying using an indexed clustering column yields no result when a row has been reinserted using an update following a delete

Details

Type: Bug
Status: Open
Priority: Normal
Resolution: Unresolved
Fix Version/s: None
Component/s: Feature/2i Index
Labels: secondary_index
Severity: Normal
Since Version: 3.11.0

Description

If a table has a secondary index on a clustering column, and a row is deleted from that table and then added back using an update, querying for the row by the indexed clustering column returns no result.

Dummy example to reproduce:

CREATE TABLE foo (
    a text,
    b text,
    c text,
    d text,
    e text,
    PRIMARY KEY (a, b, c)
);
CREATE INDEX ON foo (b);
CREATE INDEX ON foo (c);
CREATE INDEX ON foo (d);
CREATE INDEX ON foo (e);
update foo set d='4', e='5' where a='1' and b='2' and c='3';
delete from foo where a='1' and b='2' and c='3';
update foo set d='4', e='5' where a='1' and b='2' and c='3';

Queries on the indexed clustering columns, e.g.

select * from foo where b='2';
select * from foo where c='3';

yield no result. Querying on the other (indexed and non-indexed) columns works fine, though.

Here is a comparison between the sstabledump output for the index on a clustering column (b) and for the index on a non-clustering column (d). As far as I can tell, the row is considered deleted in the indexes on b and c?

# Index for b
/apache-cassandra-3.11.0/tools/bin # ./sstabledump /data/data/foo/foo-875bbb60b1ab11e8b7406d2c86545d91/.foo_b_idx/mc-1-big-Data.db
[
  {
    "partition" : {
      "key" : [ "2" ],
      "position" : 0
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 34,
        "clustering" : [ "31", "3" ],
        "deletion_info" : { "marked_deleted" : "2018-09-06T08:05:10.093704Z", "local_delete_time" : "2018-09-06T08:05:10Z" },
        "cells" : [ ]
      }
    ]
  }
]

# Index for d
/apache-cassandra-3.11.0/tools/bin # ./sstabledump /data/data/foo/foo-875bbb60b1ab11e8b7406d2c86545d91/.foo_d_idx/mc-1-big-Data.db
[
  {
    "partition" : {
      "key" : [ "4" ],
      "position" : 0
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 32,
        "clustering" : [ "31", "2", "3" ],
        "liveness_info" : { "tstamp" : "2018-09-06T08:05:13.986242Z" },
        "cells" : [ ]
      }
    ]
  }
]

This problem occurs only when the delete is followed by an update. If the row is added back with an insert instead, e.g.

update foo set d='4', e='5' where a='1' and b='2' and c='3';
delete from foo where a='1' and b='2' and c='3';
insert into foo (a, b, c, d, e) VALUES ('1', '2', '3', '4', '5');

all queries work and the dump for the indexed clustering columns looks fine as far as I can tell:

[
  {
    "partition" : {
      "key" : [ "2" ],
      "position" : 0
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 41,
        "clustering" : [ "31", "3" ],
        "liveness_info" : { "tstamp" : "2018-09-06T08:21:20.546530Z" },
        "deletion_info" : { "marked_deleted" : "2018-09-06T08:21:11.027171Z", "local_delete_time" : "2018-09-06T08:21:11Z" },
        "cells" : [ ]
      }
    ]
  }
]
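
To make the difference between the three dumps above explicit, here is a rough Python sketch that classifies an index entry as live or deleted. It is only a simplified model of the reconciliation, not Cassandra's actual read path: it assumes an index entry with no cells is visible only if it has liveness_info whose timestamp is newer than any row deletion. The function name index_entry_is_live is hypothetical, and the rows are trimmed copies of the dumps above.

```python
from datetime import datetime


def _parse(ts):
    # sstabledump prints UTC timestamps such as 2018-09-06T08:05:10.093704Z.
    # This sketch assumes the fractional seconds are always present.
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")


def index_entry_is_live(row):
    """Simplified assumption: an index entry (no cells) is live only if it
    carries liveness_info that is newer than its row deletion, if any."""
    liveness = row.get("liveness_info")
    deletion = row.get("deletion_info")
    if liveness is None:
        # Only a tombstone (or nothing) is present: treat the entry as deleted.
        return False
    if deletion is None:
        return True
    return _parse(liveness["tstamp"]) > _parse(deletion["marked_deleted"])


# Index on b after update / delete / update: only a row deletion is present.
b_after_update = {
    "deletion_info": {"marked_deleted": "2018-09-06T08:05:10.093704Z"},
    "cells": [],
}
# Index on d after update / delete / update: liveness info only.
d_after_update = {
    "liveness_info": {"tstamp": "2018-09-06T08:05:13.986242Z"},
    "cells": [],
}
# Index on b after update / delete / insert: liveness newer than the deletion.
b_after_insert = {
    "liveness_info": {"tstamp": "2018-09-06T08:21:20.546530Z"},
    "deletion_info": {"marked_deleted": "2018-09-06T08:21:11.027171Z"},
    "cells": [],
}

print(index_entry_is_live(b_after_update))  # False
print(index_entry_is_live(d_after_update))  # True
print(index_entry_is_live(b_after_insert))  # True
```

Under this model, the entry in the index on b written by the final update has no liveness_info to shadow the earlier deletion, which would be consistent with the queries on b and c returning nothing.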

I was able to reproduce this problem in both 3.11.0 and 3.11.3.

People

Assignee: Unassigned

Reporter: Jonathan Pellby

Votes: 1

Watchers: 8

Dates

Created: 06/Sep/18 16:59

Updated: 03/May/20 09:30