Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-5953

AggregateCaseToFilterRule may make inaccurate SUM transformation

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • None
    • 1.36.0
    • None

    Description

      consider: sum(case when x = 1 then 2 else 0 end) as b
      notice that this expression may only be null if there are no rows in the table

      AggregateCaseToFilterRule rewrites the above expression to sum(2) filter (where x=1) which broadens when it could be `null` to when there are no matches to the filter

      • A is 0 correctly in this case; but I think it will be still 0 in case there are 0 input rows
      • The result for B supposed to be 0 but since there are no matches by the filter it becomes null
      • C is not touched
      # Convert CASE to FILTER without matches
      select  sum(case when x = 1 then 1 else 0 end) as a,
              sum(case when x = 1 then 2 else 0 end) as b,
              sum(case when x = 1 then 3 else -1 end) as c
      from (values 0, null, 0, 2) as t(x);
      +---+---+----+
      | A | B | C  |
      +---+---+----+
      | 0 |   | -4 | # incorrect result for B
      | 0 | 0 | -4 | # correct
      +---+---+----+
      (1 row)
      
      !ok
      EnumerableCalc(expr#0..2=[{inputs}], expr#3=[CAST($t0):INTEGER], A=[$t3], B=[$t1], C=[$t2])
        EnumerableAggregate(group=[{}], A=[COUNT() FILTER $1], B=[SUM($2) FILTER $3], C=[SUM($0)])
          EnumerableCalc(expr#0=[{inputs}], expr#1=[1], expr#2=[=($t0, $t1)], expr#3=[3], expr#4=[-1], expr#5=[CASE($t2, $t3, $t4)], expr#6=[IS TRUE($t2)], expr#7=[2], $f2=[$t5], $f3=[$t6], $f4=[$t7], $f5=[$t6])
            EnumerableValues(tuples=[[{ 0 }, { null }, { 0 }, { 2 }]])
      !plan
      
      # Convert CASE to FILTER with no input rows
      select  sum(case when x = 1 then 1 else 0 end) as a,
              sum(case when x = 1 then 2 else 0 end) as b,
              sum(case when x = 1 then 3 else -1 end) as c
      from (values 0, null, 0, 2) as t(x) where x*x=1;
      +---+---+---+
      | A | B | C |
      +---+---+---+
      | 0 |   |   | # incorrect results - for 0 input rows sum supposed to be null
      |   |   |   | # correct result
      +---+---+---+
      (1 row)
      
      !ok
      
      

      Attachments

        Issue Links

          Activity

            People

              kgyrtkirk Zoltan Haindrich
              kgyrtkirk Zoltan Haindrich
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: