Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-5837

RexUtil#pullFactors output's order should be deterministic even when the RexNode kind is OR

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Closed
    • Minor
    • Resolution: Fixed
    • 1.34.0
    • 1.36.0
    • core

    Description

      In my rule, I used the RexUtil.pullFactors to simplify the Filter's condition and I tested the rule with TPC-H q19.sql.

      When I print the plan tree with RelOptUtil.toString,sometimes the plan is :

      LogicalAggregate(group=[{}], revenue=[SUM($0)])
        LogicalProject($f0=[*($5, -(1, $6))])
          LogicalFilter(condition=[AND(=($16, $1), SEARCH($14, Sarg['AIR':CHAR(7), 'AIR REG']:CHAR(7)), =($13, 'DELIVER IN PERSON'), OR(AND(=($19, 'Brand#12'), SEARCH($22, Sarg['SM BOX':CHAR(7), 'SM CASE', 'SM PACK', 'SM PKG':CHAR(7)]:CHAR(7)), >=($4, 1), <=($4, +(1, 10)), SEARCH($21, Sarg[[1..5]])), AND(=($19, 'Brand#23'), SEARCH($22, Sarg['MED BAG':CHAR(8), 'MED BOX':CHAR(8), 'MED PACK', 'MED PKG':CHAR(8)]:CHAR(8)), >=($4, 10), <=($4, +(10, 10)), SEARCH($21, Sarg[[1..10]])), AND(=($19, 'Brand#34'), SEARCH($22, Sarg['LG BOX':CHAR(7), 'LG CASE', 'LG PACK', 'LG PKG':CHAR(7)]:CHAR(7)), >=($4, 20), <=($4, +(20, 10)), SEARCH($21, Sarg[[1..15]]))))])
            LogicalJoin(condition=[true], joinType=[inner])
              LogicalTableScan(table=[[tpch, LINEITEM]])
              LogicalTableScan(table=[[tpch, PART]])
      

      and sometimes the plan is :

      LogicalAggregate(group=[{}], revenue=[SUM($0)])
        LogicalProject($f0=[*($5, -(1, $6))])
          LogicalFilter(condition=[AND(=($16, $1), =($13, 'DELIVER IN PERSON'), SEARCH($14, Sarg['AIR':CHAR(7), 'AIR REG']:CHAR(7)), OR(AND(=($19, 'Brand#12'), SEARCH($22, Sarg['SM BOX':CHAR(7), 'SM CASE', 'SM PACK', 'SM PKG':CHAR(7)]:CHAR(7)), >=($4, 1), <=($4, +(1, 10)), SEARCH($21, Sarg[[1..5]])), AND(=($19, 'Brand#23'), SEARCH($22, Sarg['MED BAG':CHAR(8), 'MED BOX':CHAR(8), 'MED PACK', 'MED PKG':CHAR(8)]:CHAR(8)), >=($4, 10), <=($4, +(10, 10)), SEARCH($21, Sarg[[1..10]])), AND(=($19, 'Brand#34'), SEARCH($22, Sarg['LG BOX':CHAR(7), 'LG CASE', 'LG PACK', 'LG PKG':CHAR(7)]:CHAR(7)), >=($4, 20), <=($4, +(20, 10)), SEARCH($21, Sarg[[1..15]]))))])
            LogicalJoin(condition=[true], joinType=[inner])
              LogicalTableScan(table=[[tpch, LINEITEM]])
              LogicalTableScan(table=[[tpch, PART]])
      

      The above two SQL Plan semantics are the same, the only difference is that the Filter conditions are in different order.
      Although this has no effect on the SQL execution results, it is difficult for me to monitor my plan because of the variability of the plan.

      Attachments

        1. image-2023-08-02-10-08-41-599.png
          180 kB
          LakeShen
        2. image-2023-08-02-10-08-06-390.png
          404 kB
          LakeShen

        Issue Links

          Activity

            People

              shenlang LakeShen
              shenlang LakeShen
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: