Uploaded image for project: 'Flink'
  1. Flink
  2. FLINK-14900

output contains null value when a is null in pattern where a <> 'String'

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.9.1
    • Fix Version/s: 1.10.0
    • Component/s: Table SQL / Planner
    • Labels:
      None

      Description

      the result of TPC-DS query 19 is incorrect, I analyze the simplify the sql as following:

      // simplify query 19
      select top 100 ca_zip,s_zip,substring(ca_zip,1,5), substring(s_zip,1,5) 
       from customer_address,store
         where substring(ca_zip,1,5) <> substring(s_zip,1,5) 
        order by ca_zip,s_zip

       

      flink output:

      null|31904|null|31904
      null|31904|null|31904
      null|31904|null|31904
      null|31904|null|31904

      SQL server output:

      00601      31904      00601 31904
      00601      31904      00601 31904
      00601      31904      00601 31904
      00601      31904      00601 31904
      

      This issue may influent the result of tpcds query 46、68.

      Another similar pattern "where a = b" in query31,  it'll will bring null value too.

       

      // simplify query 31
      select 
              ss1.ca_county
             ,ss1.d_year
             ,ss2.store_sales/ss1.store_sales store_q1_q2_increase   
       from
              ss ss1
             ,ss ss2  
       where
          ss1.d_qoy = 1
          and ss1.d_year = 2000
          and ss1.ca_county = ss2.ca_county
          and ss2.d_qoy = 2
          and ss2.d_year = 2000
        order by ss1.ca_county

       

      Flink output :

       

      |2000|0.845635
      Acadia Parish|2000|1.362160
      Accomack County|2000|0.650251
      

       

      SQL server output:

       

      Acadia Parish|2000|1.362159
      Accomack County|2000|0.650250
      

       

       

      I tested expression test and sql ITcase,they will return correct value.

      @Test
      def testSubStr(): Unit = {
        checkQuery(
          Seq[(String, String, Double)]((null, "00601" , 100d), ("00501", "00601", 50d)),
          "select f0, substring(f0,1,5) from  Table1 where substring(f0,1,5)  <> f1 ",
          Seq(("00501", "00501"))
        )
      }
      

      So, I need to dig more about this issue. 

       

       

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                Leonard Xu Leonard Xu
                Reporter:
                Leonard Xu Leonard Xu
              • Votes:
                0 Vote for this issue
                Watchers:
                2 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: