Details
Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Version: 3.4.0
Description
Subqueries in a join ON condition are not handled in decorrelation, and we also don’t have any checks to block them, so queries like the following keep outer references in the query plan, which leads to internal errors:
CREATE TEMP VIEW x(x1, x2) AS VALUES (0, 1), (1, 2);
CREATE TEMP VIEW y(y1, y2) AS VALUES (0, 2), (0, 3);
CREATE TEMP VIEW z(z1, z2) AS VALUES (0, 2), (0, 3);
SELECT * FROM x LEFT JOIN y ON x1 = y1 AND EXISTS (SELECT * FROM z WHERE z1 = x1);
Error occurred during query planning:
org.apache.spark.sql.catalyst.plans.logical.Filter cannot be cast to org.apache.spark.sql.execution.SparkPlan
PullupCorrelatedPredicates#apply and RewritePredicateSubquery only handle subqueries in a UnaryNode. They seem to assume that subqueries cannot occur elsewhere, such as in a join ON condition.
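To make the gap concrete, here is a minimal toy model in Python. It is not Spark's Catalyst code: the classes Relation, Filter, Join and Exists and the functions rewrite_unary_only and has_subquery are all hypothetical stand-ins invented for this illustration. It only sketches the general shape of the problem, assuming a rewrite rule that looks for subquery predicates in unary nodes and passes join conditions through untouched.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Exists:
    """Toy EXISTS predicate: a subquery plus its correlated condition."""
    subquery: str
    correlated: str


Predicate = Union[str, Exists]


@dataclass(frozen=True)
class Relation:
    name: str


@dataclass(frozen=True)
class Filter:  # unary node: exactly one child
    conditions: Tuple[Predicate, ...]
    child: "Plan"


@dataclass(frozen=True)
class Join:  # binary node: two children
    left: "Plan"
    right: "Plan"
    join_type: str
    conditions: Tuple[Predicate, ...]


Plan = Union[Relation, Filter, Join]


def rewrite_unary_only(plan: Plan) -> Plan:
    """Hypothetical rule: rewrites EXISTS only when it sits in a Filter."""
    if isinstance(plan, Filter):
        child = rewrite_unary_only(plan.child)
        plain = tuple(c for c in plan.conditions if not isinstance(c, Exists))
        for c in plan.conditions:
            if isinstance(c, Exists):
                # Replace the EXISTS with a left semi join on the correlated condition.
                child = Join(child, Relation(c.subquery), "left_semi", (c.correlated,))
        return Filter(plain, child) if plain else child
    if isinstance(plan, Join):
        # The join condition is copied as-is, so a subquery inside it survives.
        return Join(rewrite_unary_only(plan.left), rewrite_unary_only(plan.right),
                    plan.join_type, plan.conditions)
    return plan


def has_subquery(plan: Plan) -> bool:
    """True if any condition in the plan still holds an unrewritten EXISTS."""
    if isinstance(plan, Relation):
        return False
    if any(isinstance(c, Exists) for c in plan.conditions):
        return True
    if isinstance(plan, Filter):
        return has_subquery(plan.child)
    return has_subquery(plan.left) or has_subquery(plan.right)


x, y = Relation("x"), Relation("y")
exists = Exists("z", "z1 = x1")

in_filter = Filter((exists,), x)                         # EXISTS in a WHERE clause
in_join = Join(x, y, "left_outer", ("x1 = y1", exists))  # EXISTS in the ON condition

print(has_subquery(rewrite_unary_only(in_filter)))  # False
print(has_subquery(rewrite_unary_only(in_join)))    # True
```

In this toy model the EXISTS in the filter is turned into a semi join, while the one in the ON condition is still present after the rule runs, which mirrors the leftover outer reference described above.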
We need to decide whether to block them in analysis (i.e. raise a proper error for them) or to add support for them.
Also note that scalar subqueries in the ON condition are unsupported too, but they return a proper error.