Cassandra / CASSANDRA-12573

SASI index. Incorrect results for '%foo%bar%'-like search pattern.

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Urgent
    • Resolution: Duplicate
    • Severity: Critical

    Description

      We use Cassandra 3.7 and see incorrect results from SELECT queries that use a "LIKE '%foo%bar%'" constraint on a column with a SASI index.
      The experiments below demonstrate this behaviour.

      Experiment 1:

      drop keyspace if exists kmv;
      create keyspace if not exists kmv WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor':'1'} ;
      
      use kmv;
      
      CREATE TABLE if not exists kmv (id int primary key, c1 text, c2 text);
      
      CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
       'mode': 'CONTAINS'
      };
      
      insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
      insert into kmv (id, c1, c2) values (2, 'f22', 'qweasd') ;
      insert into kmv (id, c1, c2) values (3, 'f23', 'qwea1') ;
      insert into kmv (id, c1, c2) values (4, 'f24', '1qwe') ;
      insert into kmv (id, c1, c2) values (5, 'f25', 'asdqwe') ;
      
      select c2 from kmv.kmv where c2 like '%w%a%';
      

      Expected result: qweasd, qwea1.
      Actual result: no rows.

      Experiment 2 (NOTE: the index definition is changed):

      drop keyspace if exists kmv;
      create keyspace if not exists kmv WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor':'1'} ;
      
      use kmv;
      
      CREATE TABLE if not exists kmv (id int primary key, c1 text, c2 text);
      
      CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
       'mode': 'CONTAINS',
       'analyzer_class': 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
       'analyzed': 'true'
      };
      
      insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
      insert into kmv (id, c1, c2) values (2, 'f22', 'qweasd') ;
      insert into kmv (id, c1, c2) values (3, 'f23', 'qwea1') ;
      insert into kmv (id, c1, c2) values (4, 'f24', '1qwe') ;
      insert into kmv (id, c1, c2) values (5, 'f25', 'asdqwe') ;
      
      select c2 from kmv.kmv where c2 like '%w%a%';
      

      Expected result: qweasd, qwea1.
      Actual result: asdqwe, qweasd, qwea1.

      Experiment 3 (NOTE: the primary key is now compound and the inserted data is changed):

      drop keyspace if exists kmv;
      create keyspace if not exists kmv WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor':'1'} ;
      
      use kmv;
      
      CREATE TABLE if not exists kmv (id int, c1 text, c2 text, PRIMARY KEY(id, c1));
      
      CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
       'mode': 'CONTAINS',
       'analyzer_class': 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
       'analyzed': 'true'
      };
      
      insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
      insert into kmv (id, c1, c2) values (1, 'f22', 'qweasd') ;
      insert into kmv (id, c1, c2) values (1, 'f23', 'qwea1') ;
      insert into kmv (id, c1, c2) values (1, 'f24', '1qwe') ;
      insert into kmv (id, c1, c2) values (1, 'f25', 'asdqwe') ;
      
      select c2 from kmv.kmv where c2 like '%w%a%';
      

      Expected result: qweasd, qwea1.
      Actual result: qwe, qweasd, qwea1, 1qwe, asdqwe.

      Experiment 4 (NOTE: the search pattern is changed):

      drop keyspace if exists kmv;
      create keyspace if not exists kmv WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor':'1'} ;
      
      use kmv;
      
      CREATE TABLE if not exists kmv (id int, c1 text, c2 text, PRIMARY KEY(id, c1));
      
      CREATE CUSTOM INDEX ON kmv.kmv  ( c2 ) USING 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
       'mode': 'CONTAINS',
       'analyzer_class': 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
       'analyzed': 'true'
      };
      
      insert into kmv (id, c1, c2) values (1, 'f21', 'qwe') ;
      insert into kmv (id, c1, c2) values (1, 'f22', 'qweasd') ;
      insert into kmv (id, c1, c2) values (1, 'f23', 'qwea1') ;
      insert into kmv (id, c1, c2) values (1, 'f24', '1qwe') ;
      insert into kmv (id, c1, c2) values (1, 'f25', 'asdqwe') ;
      
      select c2 from kmv.kmv where c2 like '%w22%a%';
      

      Expected result: no rows.
      Actual result: qweasd, qwea1, asdqwe.
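
The "Expected result" lines in the experiments above assume SQL-style LIKE semantics, where each '%' matches any (possibly empty) run of characters and everything else is literal. The following is a minimal client-side sketch of that assumption, useful for cross-checking what a query should return. It is not how SASI evaluates patterns; the names like_to_regex, like_filter and C2_VALUES are hypothetical and chosen only for this illustration.

```python
import re

# The five c2 values inserted in every experiment above.
C2_VALUES = ["qwe", "qweasd", "qwea1", "1qwe", "asdqwe"]


def like_to_regex(pattern):
    """Translate a LIKE pattern into a regex (assumed SQL-style semantics).

    '%' becomes '.*'; every other character is matched literally.
    """
    literal_parts = (re.escape(part) for part in pattern.split("%"))
    return re.compile(".*".join(literal_parts), re.DOTALL)


def like_filter(values, pattern):
    """Return the values that match the whole LIKE pattern, in input order."""
    regex = like_to_regex(pattern)
    return [value for value in values if regex.fullmatch(value)]


if __name__ == "__main__":
    print(like_filter(C2_VALUES, "%w%a%"))    # ['qweasd', 'qwea1']
    print(like_filter(C2_VALUES, "%w22%a%"))  # []
```

Under this assumption the two patterns select exactly the rows listed as expected in Experiments 1-4, so any other result set points at the index or query path rather than at the test data.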

People

    • Assignee: Unassigned
    • Reporter: Mikhail Krupitskiy (mkrupits_jb)
    • Votes: 1
    • Watchers: 5