Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-714

When de-correlating, push join condition into subquery

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.1.0-incubating
    • Fix Version/s: 1.4.0-incubating
    • Component/s: None
    • Labels:
      None

      Description

      When decorrelating a scalar aggregate subquery, we push copies of the tables from the outer query block into the subquery but don't push the join condition, thereby creating a cartesian join. This seems to be a regression.

      Query:

      select count(*) 
          from dfs.`/Users/asinha/data/tpch-sf1/nation` n, dfs.`/Users/asinha/data/tpch-sf1/orders` o 
               where n.n_nationkey = o.o_orderkey 
               and n.n_nationkey > (select avg(ps.ps_suppkey) from dfs.`/Users/asinha/data/tpch-sf1/partsupp` ps
                                     where n.n_regionkey = ps.ps_partkey);
      

      Here's the plan on an earlier version of Calcite (I am not sure of the version number but it was before the decorrelation refactoring), Note the join between nation and orders at the leaf level has a equi-join condition.

       
      AggregateRel(group=[{}], EXPR$0=[COUNT()])
        ProjectRel($f0=[$0])
          ProjectRel($f0=[0])
            FilterRel(condition=[>($1, $5)])
              ProjectRel(*=[$0], n_nationkey=[$1], n_regionkey=[$2], *0=[$3], o_orderkey=[$4], EXPR$0=[$6])
                JoinRel(condition=[=($2, $5)], joinType=[left])
                  JoinRel(condition=[=($1, $4)], joinType=[inner])
                    EnumerableTableAccessRel(table=[[dfs, /Users/asinha/data/tpch-sf1/nation]])
                    EnumerableTableAccessRel(table=[[dfs, /Users/asinha/data/tpch-sf1/orders]])
                  AggregateRel(group=[{0}], EXPR$0=[AVG($1)])
                    ProjectRel($f0=[$1], ps_suppkey=[$0])
                      ProjectRel(ps_suppkey=[$2], $f0=[$3])
                        FilterRel(condition=[=($3, $1)])
                          JoinRel(condition=[true], joinType=[inner])
                            EnumerableTableAccessRel(table=[[dfs, /Users/asinha/data/tpch-sf1/partsupp]])
                            AggregateRel(group=[{0}])
                              ProjectRel($f0=[$2])
                                JoinRel(condition=[=($1, $4)], joinType=[inner])
                                  EnumerableTableAccessRel(table=[[dfs, /Users/asinha/data/tpch-sf1/nation]])
                                  EnumerableTableAccessRel(table=[[dfs, /Users/asinha/data/tpch-sf1/orders]])
      

      Here's the new plan (I am on version 1.1.0 but I think the plan has not changed in the latest version). Note the join between nation and orders at the leaf level does not have any join condition.

      LogicalAggregate(group=[{}], EXPR$0=[COUNT()])
        LogicalProject($f0=[$0])
          LogicalProject($f0=[0])
            LogicalFilter(condition=[AND(=($1, $4), >($1, $5))])
              LogicalProject(*=[$0], n_nationkey=[$1], n_regionkey=[$2], *0=[$3], o_orderkey=[$4], EXPR$0=[$6])
                LogicalJoin(condition=[=($2, $5)], joinType=[left])
                  LogicalJoin(condition=[true], joinType=[inner])
                    EnumerableTableScan(table=[[dfs, /Users/asinha/data/tpch-sf1/nation]])
                    EnumerableTableScan(table=[[dfs, /Users/asinha/data/tpch-sf1/orders]])
                  LogicalAggregate(group=[{0}], EXPR$0=[AVG($1)])
                    LogicalProject(n_regionkey=[$1], ps_suppkey=[$0])
                      LogicalProject(ps_suppkey=[$2], n_regionkey=[$3])
                        LogicalFilter(condition=[=($3, $1)])
                          LogicalJoin(condition=[true], joinType=[inner])
                            EnumerableTableScan(table=[[dfs, /Users/asinha/data/tpch-sf1/partsupp]])
                            LogicalAggregate(group=[{0}])
                              LogicalProject(n_regionkey=[$2])
                                LogicalJoin(condition=[true], joinType=[inner])
                                  EnumerableTableScan(table=[[dfs, /Users/asinha/data/tpch-sf1/nation]])
                                  EnumerableTableScan(table=[[dfs, /Users/asinha/data/tpch-sf1/orders]])
      

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                amansinha100 Aman Sinha
                Reporter:
                amansinha100 Aman Sinha
              • Votes:
                0 Vote for this issue
                Watchers:
                3 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: