Calcite / CALCITE-341

Redundant AggregateRel for IN subquery


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.9.1-incubating
    • Component/s: None
    • Labels: None

    Description

      The following query against TPC-H creates two AggregateRels for the IN subquery: one for the GROUP BY and one for the DISTINCT on the same column. Because the GROUP BY already makes r_regionkey distinct, the second AggregateRel is redundant and hurts performance.

      SELECT n_name FROM nation
        WHERE n_regionkey IN (SELECT r_regionkey FROM region
                              GROUP BY r_regionkey);
      
      ProjectRel(n_name=[$2])
        JoinRel(condition=[=($3, $4)], joinType=[inner])
          ProjectRel($f0=[$0], $f1=[$1], $f2=[$2], $f3=[$1])
            EnumerableTableAccessRel(table=[[dfs, TpchSf1, nation]])
          AggregateRel(group=[{0}])
            AggregateRel(group=[{0}])
              ProjectRel(r_regionkey=[$1])
                EnumerableTableAccessRel(table=[[dfs, TpchSf1, region]])
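
To make the redundancy concrete, here is a minimal Python sketch of a toy plan tree and a rewrite that drops an aggregate stacked on an identical aggregate. It is an illustration only, not the fix applied in Calcite: `Node`, `remove_redundant_aggregate`, and `count_aggregates` are hypothetical names, and the rule assumes the simplest case, where neither aggregate computes any aggregate function and the outer one groups on exactly the columns the inner one outputs.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class Node:
    """Toy plan node (hypothetical; not Calcite's RelNode API)."""
    kind: str                          # "Aggregate", "Project", "Scan", ...
    group: Tuple[int, ...] = ()        # group-by column ordinals (Aggregate only)
    agg_calls: Tuple[str, ...] = ()    # aggregate functions (Aggregate only)
    child: Optional["Node"] = None


def remove_redundant_aggregate(node: Optional[Node]) -> Optional[Node]:
    """Rewrite bottom-up, dropping a DISTINCT-style Aggregate whose input
    is already an Aggregate producing exactly the same unique columns."""
    if node is None:
        return None
    child = remove_redundant_aggregate(node.child)
    node = replace(node, child=child)
    if (
        node.kind == "Aggregate"
        and not node.agg_calls
        and child is not None
        and child.kind == "Aggregate"
        and not child.agg_calls
        # The inner aggregate outputs its group columns as ordinals 0..n-1,
        # so grouping on exactly those ordinals again changes nothing.
        and node.group == tuple(range(len(child.group)))
    ):
        return child
    return node


def count_aggregates(node: Optional[Node]) -> int:
    """Count Aggregate nodes in a single-input plan chain."""
    total = 0
    while node is not None:
        total += node.kind == "Aggregate"
        node = node.child
    return total


# The IN-subquery side of the plan above, in toy form.
scan = Node("Scan")
project = Node("Project", child=scan)
inner = Node("Aggregate", group=(0,), child=project)   # GROUP BY r_regionkey
outer = Node("Aggregate", group=(0,), child=inner)     # DISTINCT for IN
rewritten = remove_redundant_aggregate(outer)
```

After the rewrite, `rewritten` contains a single aggregate over the projection. An outer aggregate that groups on fewer columns than the inner one outputs is left in place, because removing it would change the result.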
      

          People

            Assignee: julianhyde Julian Hyde
            Reporter: amansinha100 Aman Sinha