Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Description
The following query against TPCH creates two AggregateRels for the IN subquery: one for the GROUP BY and one for the DISTINCT on the same column. Since the GROUP BY already produces distinct values, the second AggregateRel is redundant and hurts performance.
SELECT n_name FROM nation WHERE n_regionkey IN (SELECT r_regionkey FROM region GROUP BY r_regionkey);

ProjectRel(n_name=[$2])
  JoinRel(condition=[=($3, $4)], joinType=[inner])
    ProjectRel($f0=[$0], $f1=[$1], $f2=[$2], $f3=[$1])
      EnumerableTableAccessRel(table=[[dfs, TpchSf1, nation]])
    AggregateRel(group=[{0}])
      AggregateRel(group=[{0}])
        ProjectRel(r_regionkey=[$1])
          EnumerableTableAccessRel(table=[[dfs, TpchSf1, region]])
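
To illustrate why the second AggregateRel is redundant, here is a minimal sketch on a toy plan tree. It is not the planner's actual API or the fix that was applied: the `Rel` class and the `remove_redundant_aggregates` and `count_aggregates` helpers are hypothetical names introduced only for this example. The assumption is that an aggregate with no aggregate calls, grouping on every output column of a child aggregate in the same order, cannot change its input, because the child's rows are already unique on its group keys.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Rel:
    """Toy plan node (hypothetical; not the planner's real class)."""
    kind: str
    inputs: Tuple["Rel", ...] = ()
    group: Tuple[int, ...] = ()      # group-key column indexes (Aggregate only)
    agg_calls: Tuple[str, ...] = ()  # aggregate calls, e.g. ("COUNT(*)",)


def remove_redundant_aggregates(rel: Rel) -> Rel:
    """Drop a pure-DISTINCT aggregate that sits directly on another aggregate."""
    # Rewrite the inputs first so stacked aggregates collapse bottom-up.
    inputs = tuple(remove_redundant_aggregates(i) for i in rel.inputs)
    rel = replace(rel, inputs=inputs)
    if rel.kind == "Aggregate" and not rel.agg_calls and len(inputs) == 1:
        child = inputs[0]
        if child.kind == "Aggregate":
            # The child emits its group keys followed by its aggregate calls.
            width = len(child.group) + len(child.agg_calls)
            # Grouping on all child columns, in order, is a no-op.
            if rel.group == tuple(range(width)):
                return child
    return rel


def count_aggregates(rel: Rel) -> int:
    """Count Aggregate nodes in the tree."""
    own = 1 if rel.kind == "Aggregate" else 0
    return own + sum(count_aggregates(i) for i in rel.inputs)


# Toy version of the plan in this report.
region = Rel("Project", (Rel("TableAccess"),))
subquery = Rel("Aggregate", (Rel("Aggregate", (region,), group=(0,)),), group=(0,))
nation = Rel("Project", (Rel("TableAccess"),))
plan = Rel("Project", (Rel("Join", (nation, subquery)),))

optimized = remove_redundant_aggregates(plan)
print(count_aggregates(plan), count_aggregates(optimized))
```

The check is deliberately conservative: it only fires when the outer group set matches the child's output columns exactly, so it never needs to add a projection to preserve column order.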