Uploaded image for project: 'Flink'
  1. Flink
  2. FLINK-24708

`ConvertToNotInOrInRule` has a bug which leads to wrong result

    XMLWordPrintableJSON

Details

    Description

      A user report this bug in maillist, I paste the content here.

      We are in the process of upgrading from Flink 1.9.3 to 1.13.3.  We have noticed that statements with either where UPPER(field) or LOWER(field) in combination with an IN do not always evaluate correctly. 

       

      The following test case highlights this problem.

       

       

      import org.apache.flink.streaming.api.datastream.DataStream;
      import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
      import org.apache.flink.table.api.Schema;
      import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
      
      public class TestCase {
          public static void main(String[] args) throws Exception {
              final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
              env.setParallelism(1);
      
              TestData testData = new TestData();
              testData.setField1("bcd");
              DataStream<TestData> stream = env.fromElements(testData);
              stream.print();  // To prevent 'No operators' error
      
              final StreamTableEnvironment tableEnvironment = StreamTableEnvironment.create(env);
              tableEnvironment.createTemporaryView("testTable", stream, Schema.newBuilder().build());
      
              // Fails because abcd is larger than abc
              tableEnvironment.executeSql("select *, '1' as run from testTable WHERE lower(field1) IN ('abcd', 'abc', 'bcd', 'cde')").print();
              // Succeeds because lower was removed
              tableEnvironment.executeSql("select *, '2' as run from testTable WHERE field1 IN ('abcd', 'abc', 'bcd', 'cde')").print();
              // These 4 succeed because the smallest literal is before abcd
              tableEnvironment.executeSql("select *, '3' as run from testTable WHERE lower(field1) IN ('abc', 'abcd', 'bcd', 'cde')").print();
              tableEnvironment.executeSql("select *, '4' as run from testTable WHERE lower(field1) IN ('abc', 'bcd', 'abhi', 'cde')").print();
              tableEnvironment.executeSql("select *, '5' as run from testTable WHERE lower(field1) IN ('cde', 'abcd', 'abc', 'bcd')").print();
              tableEnvironment.executeSql("select *, '6' as run from testTable WHERE lower(field1) IN ('cde', 'abc', 'abcd', 'bcd')").print();
              // Fails because smallest is not first
              tableEnvironment.executeSql("select *, '7' as run from testTable WHERE lower(field1) IN ('cdef', 'abce', 'abcd', 'ab', 'bcd')").print();
              // Succeeds
              tableEnvironment.executeSql("select *, '8' as run from testTable WHERE lower(field1) IN ('ab', 'cdef', 'abce', 'abcdefgh', 'bcd')").print();
      
              env.execute("TestCase");
          }
      
          public static class TestData {
              private String field1;
      
              public String getField1() {
                  return field1;
              }
      
              public void setField1(String field1) {
                  this.field1 = field1;
              }
          }
      }
      

       

      The job produces the following output:

      Empty set

      ---------------------------------++-------------------------------

      op                          field1                             run

      ---------------------------------++-------------------------------

      +I                             bcd                               2

      ---------------------------------++-------------------------------

      1 row in set

      ---------------------------------++-------------------------------

      op                          field1                             run

      ---------------------------------++-------------------------------

      +I                             bcd                               3

      ---------------------------------++-------------------------------

      1 row in set

      ---------------------------------++-------------------------------

      op                          field1                             run

      ---------------------------------++-------------------------------

      +I                             bcd                               4

      ---------------------------------++-------------------------------

      1 row in set

      ---------------------------------++-------------------------------

      op                          field1                             run

      ---------------------------------++-------------------------------

      +I                             bcd                               5

      ---------------------------------++-------------------------------

      1 row in set

      ---------------------------------++-------------------------------

      op                          field1                             run

      ---------------------------------++-------------------------------

      +I                             bcd                               6

      ---------------------------------++-------------------------------

      1 row in set

      Empty set

      ---------------------------------++-------------------------------

      op                          field1                             run

      ---------------------------------++-------------------------------

      +I                             bcd                               8

      ---------------------------------++-------------------------------

      1 row in set

       

       

      Attachments

        Issue Links

          Activity

            People

              fsk119 Shengkai Fang
              jingzhang Jing Zhang
              Votes:
              0 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: