Description
Correlated subqueries with LIMIT can return incorrect results. The ResolveSubquery rule in the analysis phase moves correlated predicates into join predicates and neglects the semantics of the LIMIT clause.
Example:
Seq(1, 2).toDF("c1").createOrReplaceTempView("t1")
Seq(1, 2).toDF("c2").createOrReplaceTempView("t2")
sql("select c1 from t1 where exists (select 1 from t2 where t1.c1=t2.c2 LIMIT 1)").show
+---+
| c1|
+---+
|  1|
+---+
The correct result contains both rows from t1 (1 and 2): each has a matching row in t2, so the EXISTS subquery is true for both.
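The difference between the two evaluation orders can be sketched with plain Scala collections, without Spark. This is a simplified model of the behavior described above, not the actual Catalyst plan; the names `expected`, `limited`, and `rewritten` are illustrative only.

```scala
// Plain Scala collections standing in for the two temp views (no Spark needed).
val t1 = Seq(1, 2)
val t2 = Seq(1, 2)

// Intended semantics: for each outer row, filter t2 by the correlated
// predicate first, then apply LIMIT 1 to that per-row result.
val expected = t1.filter(c1 => t2.filter(c2 => c1 == c2).take(1).nonEmpty)

// Simplified model of the faulty rewrite: LIMIT 1 is applied to t2 once,
// and the correlated predicate is only evaluated afterwards, as a join condition.
val limited = t2.take(1)
val rewritten = t1.filter(c1 => limited.exists(c2 => c1 == c2))

println(expected)   // List(1, 2)
println(rewritten)  // List(1)
```

The snippet is written as top-level statements, so it is meant to be pasted into the Scala REPL or spark-shell. The second result matches the single row shown in the example output, which is consistent with the LIMIT being applied before the correlated predicate.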
Issue Links
- relates to SPARK-18455 General support for correlated subquery processing (Resolved)