Details
-
Improvement
-
Status: Open
-
Major
-
Resolution: Unresolved
-
3.1.0
-
None
-
None
Description
Some queries would benefit from evaluating subqueries over LocalRelations eagerly. For example:
SELECT t1.part_col FROM t1 JOIN (SELECT max(part_col) m FROM t2) foo WHERE t1.part_col = foo.m
If max(part_col) could be evaluated during planning, there's an opportunity to prune all but at most one partitions from the scan of t1.
Similarly, a near-identical query with a non-scalar subquery in the WHERE clause:
SELECT * FROM t1 WHERE part_col IN (SELECT part_col FROM t2)
could be partially evaluated to eliminate some partitions, and remove the join from the plan.
Obviously all subqueries over local relations can't be evaluated during planning, but certain whitelisted aggregates could be if the input cardinality isn't too high.