Uploaded image for project: 'Apache Cassandra'
  1. Apache Cassandra
  2. CASSANDRA-9631

Unnecessary required filtering for query on indexed clustering key

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Normal
    • Resolution: Fixed
    • 2.0.17, 2.1.8, 2.2.0 rc2
    • Legacy/CQL
    • Cassandra 2.1.6 vanilla; 3-node local cluster; OSX Yosemite 10.10.3; Installed with CCM.

    • Normal

    Description

      Let's create and populate a simple table composed of one partition key a, two clustering keys b & c, and one secondary index on a standard column e:

      $ cqlsh 127.0.0.1
      Connected to test21 at 127.0.0.1:9160.
      [cqlsh 4.1.1 | Cassandra 2.1.6-SNAPSHOT | CQL spec 3.1.1 | Thrift protocol 19.39.0]
      Use HELP for help.
      cqlsh> CREATE KEYSPACE test WITH REPLICATION={'class': 'SimpleStrategy', 'replication_factor': 3};
      cqlsh> CREATE TABLE test.table1 (
         ...     a int,
         ...     b int,
         ...     c int,
         ...     d int,
         ...     e int,
         ...     PRIMARY KEY (a, b, c)
         ... );
      cqlsh> CREATE INDEX table1_e ON test.table1 (e);
      cqlsh> INSERT INTO test.table1 (a, b, c, d, e) VALUES (1, 1, 1, 1, 1);
      (...)
      cqlsh> SELECT * FROM test.table1;
      
       a | b | c | d | e
      ---+---+---+---+---
       1 | 1 | 1 | 1 | 1
       1 | 1 | 2 | 2 | 2
       1 | 1 | 3 | 3 | 3
       1 | 2 | 1 | 1 | 3
       1 | 3 | 1 | 1 | 1
       2 | 4 | 1 | 1 | 1
      
      (6 rows)
      

      With such a schema, I am allowed to query on the indexed column without filtering by providing the first two elements of the primary key:

      cqlsh> SELECT * FROM test.table1 WHERE a=1 AND b=1 AND e=3;
      
       a | b | c | d | e
      ---+---+---+---+---
       1 | 1 | 3 | 3 | 3
      
      (1 rows)
      

      Let's now introduce an index on the first clustering key:

      cqlsh> CREATE INDEX table1_b ON test.table1 (b);
      

      Now, I expect the same query as above to work without filtering, but it's not:

      cqlsh> SELECT * FROM test.table1 WHERE a=1 AND b=1 AND e=3;
      Bad Request: Cannot execute this query as it might involve data filtering and thus may have unpredictable performance. If you want to execute this query despite the performance unpredictability, use ALLOW FILTERING
      

      I think this is a bug on the way secondary indexes are accounted for when checking for unfiltered queries.

      Attachments

        1. 9631-2.2.txt
          8 kB
          Benjamin Lerer
        2. 9631-2.1.txt
          7 kB
          Benjamin Lerer
        3. 9631-2.0.txt
          11 kB
          Benjamin Lerer

        Activity

          People

            blerer Benjamin Lerer
            kdeldycke Kevin Deldycke
            Benjamin Lerer
            Tom Hobbs
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: