Calcite / CALCITE-5051

UNION query plan prevents projection push down

    Description

      As a user with a custom Calcite adapter that supports push down, I should be able to run a UNION query whose branches contain joins and still get the benefit of projection push down.

      Given a query such as:

      SELECT Id
        FROM MySchema.t1
      UNION
      SELECT t3.Id
        FROM MySchema.t2
        JOIN MySchema.t3 ON (t3.Id = t2.t3_Id)
      

      I expect a resulting query plan that looks like:

      EnumerableUnion(all=[true])
        MyEnumerableConverter
          MyProject(Id=[$0])
            MyTableScan(table=[[MySchema, t1]])
        EnumerableCalc(expr#0..1=[{inputs}], Id=[$t1])
          EnumerableMergeJoin(condition=[=($0, $1)], joinType=[inner])
            EnumerableSort(sort0=[$0], dir0=[ASC])
              EnumerableCalc(expr#0..100=[{inputs}], expr#101=[CAST($t1):BIGINT NOT NULL], t3_Id0=[$t101])
                MyEnumerableConverter
                  MyTableScan(table=[[MySchema, t2]])
            EnumerableSort(sort0=[$0], dir0=[ASC])
              MyEnumerableConverter
                MyProject(Id=[$0])
                  MyTableScan(table=[[MySchema, t3]])
      

      But instead I observed:

      EnumerableUnion(all=[false])
        MyEnumerableConverter
          MyProject(Id=[$0])
            MyTableScan(table=[[MySchema, t1]])
        EnumerableCalc(expr#0..251=[{inputs}], Id=[$t102])
          EnumerableMergeJoin(condition=[=($101, $102)], joinType=[inner])
            EnumerableSort(sort0=[$101], dir0=[ASC])
              EnumerableCalc(expr#0..100=[{inputs}], expr#101=[CAST($t1):BIGINT NOT NULL], proj#0..101=[{exprs}])
                MyEnumerableConverter
                  MyTableScan(table=[[MySchema, t2]])
            EnumerableSort(sort0=[$0], dir0=[ASC])
              MyEnumerableConverter
                MyTableScan(table=[[MySchema, t3]])
      

      Note that:

      1. The EnumerableCalc node applied to the EnumerableMergeJoin goes from taking the 2 expected input fields (expr#0..1) to taking 252 input fields (expr#0..251)
      2. The MyProject node expected to be applied to MyTableScan(table=[[MySchema, t3]]) is missing from the observed plan
      3. Issue was observed after upgrading from 1.24 to 1.29, so may affect one or more intervening releases
      4. PR containing reproducing unit test: https://github.com/apache/calcite/pull/2747
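
The first two observations can be checked mechanically against the plan text. The following is a minimal Java sketch, not part of Calcite and not taken from the linked PR. The class PlanTextCheck and its methods are hypothetical names chosen for illustration. It assumes the plan is printed one node per line with children indented deeper than their parent, and that a Calc node prints its input range as expr#0..N=[{inputs}], as in the plans in this report.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for inspecting plan text; not part of Calcite. */
class PlanTextCheck {
  // Input range of a Calc node: "expr#0..251=[{inputs}]",
  // or "expr#0=[{inputs}]" when there is a single input field.
  private static final Pattern CALC_INPUTS =
      Pattern.compile("expr#0(?:\\.\\.(\\d+))?=\\[\\{inputs\\}\\]");

  /** Number of input fields of a Calc plan line, or -1 if no input range is present. */
  public static int calcInputFieldCount(String planLine) {
    Matcher m = CALC_INPUTS.matcher(planLine);
    if (!m.find()) {
      return -1;
    }
    // expr#0..N covers indexes 0 through N, so the count is N + 1.
    return m.group(1) == null ? 1 : Integer.parseInt(m.group(1)) + 1;
  }

  /** MyTableScan nodes whose direct parent in the plan is not a MyProject. */
  public static List<String> scansWithoutProject(String plan) {
    List<String> result = new ArrayList<>();
    String[] lines = plan.split("\n");
    for (int i = 0; i < lines.length; i++) {
      String node = lines[i].trim();
      if (!node.startsWith("MyTableScan")) {
        continue;
      }
      int indent = indentOf(lines[i]);
      String parent = "";
      // The parent is the nearest preceding non-blank line with smaller indentation.
      for (int j = i - 1; j >= 0; j--) {
        if (!lines[j].trim().isEmpty() && indentOf(lines[j]) < indent) {
          parent = lines[j].trim();
          break;
        }
      }
      if (!parent.startsWith("MyProject")) {
        result.add(node);
      }
    }
    return result;
  }

  // Counts leading spaces; assumes the plan is indented with spaces only.
  private static int indentOf(String line) {
    int n = 0;
    while (n < line.length() && line.charAt(n) == ' ') {
      n++;
    }
    return n;
  }

  public static void main(String[] args) {
    // Join branch of the observed plan from this report.
    String observedJoinBranch = String.join("\n",
        "EnumerableCalc(expr#0..251=[{inputs}], Id=[$t102])",
        "  EnumerableMergeJoin(condition=[=($101, $102)], joinType=[inner])",
        "    EnumerableSort(sort0=[$101], dir0=[ASC])",
        "      EnumerableCalc(expr#0..100=[{inputs}], expr#101=[CAST($t1):BIGINT NOT NULL], proj#0..101=[{exprs}])",
        "        MyEnumerableConverter",
        "          MyTableScan(table=[[MySchema, t2]])",
        "    EnumerableSort(sort0=[$0], dir0=[ASC])",
        "      MyEnumerableConverter",
        "        MyTableScan(table=[[MySchema, t3]])");
    String topCalc = observedJoinBranch.split("\n")[0];
    System.out.println(calcInputFieldCount(topCalc));
    System.out.println(scansWithoutProject(observedJoinBranch));
  }
}
```

Run against the observed join branch, this reports 252 input fields for the top EnumerableCalc and lists both the t2 and t3 scans as lacking a MyProject parent. The expected plan would give 2 input fields and list only the t2 scan, since t2 has no MyProject above it in either plan.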

            People

              Assignee: Unassigned
              Reporter: Zachary Gramana (gramana)