CASSANDRA-9631

Unnecessary required filtering for query on indexed clustering key


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Normal
    • Resolution: Fixed
    • Fix Version/s: 2.0.17, 2.1.8, 2.2.0 rc2
    • Component/s: Legacy/CQL
    • Environment: Cassandra 2.1.6 vanilla; 3-node local cluster; OS X Yosemite 10.10.3; installed with CCM.
    • Severity: Normal

    Description

      Let's create and populate a simple table with one partition key a, two clustering columns b and c, and a secondary index on a regular column e:

      $ cqlsh 127.0.0.1
      Connected to test21 at 127.0.0.1:9160.
      [cqlsh 4.1.1 | Cassandra 2.1.6-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 19.39.0]
      Use HELP for help.
      cqlsh> CREATE KEYSPACE test WITH REPLICATION={'class': 'SimpleStrategy', 'replication_factor': 3};
      cqlsh> CREATE TABLE test.table1 (
         ...     a int,
         ...     b int,
         ...     c int,
         ...     d int,
         ...     e int,
         ...     PRIMARY KEY (a, b, c)
         ... );
      cqlsh> CREATE INDEX table1_e ON test.table1 (e);
      cqlsh> INSERT INTO test.table1 (a, b, c, d, e) VALUES (1, 1, 1, 1, 1);
      (...)
      cqlsh> SELECT * FROM test.table1;
      
       a | b | c | d | e
      ---+---+---+---+---
       1 | 1 | 1 | 1 | 1
       1 | 1 | 2 | 2 | 2
       1 | 1 | 3 | 3 | 3
       1 | 2 | 1 | 1 | 3
       1 | 3 | 1 | 1 | 1
       2 | 4 | 1 | 1 | 1
      
      (6 rows)
      

      With this schema, I can query on the indexed column e without ALLOW FILTERING when I also restrict the first two primary key columns (a and b):

      cqlsh> SELECT * FROM test.table1 WHERE a=1 AND b=1 AND e=3;
      
       a | b | c | d | e
      ---+---+---+---+---
       1 | 1 | 3 | 3 | 3
      
      (1 rows)
      

      Let's now introduce an index on the first clustering key:

      cqlsh> CREATE INDEX table1_b ON test.table1 (b);
      

      I expect the same query as above to still work without filtering, but it does not:

      cqlsh> SELECT * FROM test.table1 WHERE a=1 AND b=1 AND e=3;
      Bad Request: Cannot execute this query as it might involve data filtering and thus may have unpredictable performance. If you want to execute this query despite the performance unpredictability, use ALLOW FILTERING
      

      I think this is a bug in the way secondary indexes are accounted for when checking whether a query requires filtering.
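
      Two workarounds look plausible on an affected version. This is only a sketch: the first assumes the cost of filtering is acceptable for this query, the second assumes no other query depends on the index on b. Neither has been measured here, and no output is shown because it was not captured.

      cqlsh> -- Workaround 1 (assumption: filtering cost is acceptable): follow the hint in the error message.
      cqlsh> SELECT * FROM test.table1 WHERE a=1 AND b=1 AND e=3 ALLOW FILTERING;
      cqlsh> -- Workaround 2 (assumption: nothing else needs table1_b): drop the index created above,
      cqlsh> -- which returns the schema to the state in which the query was accepted.
      cqlsh> DROP INDEX test.table1_b;
      cqlsh> SELECT * FROM test.table1 WHERE a=1 AND b=1 AND e=3;

      The first option keeps both indexes but gives up the guarantee of predictable performance; the second keeps the guarantee but loses lookups by b alone.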

      Attachments

        1. 9631-2.0.txt
          11 kB
          Benjamin Lerer
        2. 9631-2.1.txt
          7 kB
          Benjamin Lerer
        3. 9631-2.2.txt
          8 kB
          Benjamin Lerer

          People

            Assignee: Benjamin Lerer (blerer)
            Reporter: Kevin Deldycke (kdeldycke)
            Authors: Benjamin Lerer
            Reviewers: Tom Hobbs
            Votes: 0
            Watchers: 5
