Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Duplicate
-
2.1.0
-
None
-
None
Description
[subquery/in-subquery/in-group-by.sql TC 01.12]
Seq((1,1,1)).toDF("t1a", "t1b", "t1c").createOrReplaceTempView("t1") Seq((1,1,1)).toDF("t2a", "t2b", "t2c").createOrReplaceTempView("t2") Seq((1,1,1)).toDF("t3a", "t3b", "t3c").createOrReplaceTempView("t3") — TC 01.12 select * from t1 where t1a in (select min(t2a) from t2 where t2a = t2a and t2c >= 1 group by t2c having t2c in (select t3c from t3 group by t3c, t3b having t2b > 6 and t3b > t2b )) == Parsed Logical Plan == 'Project [*] +- 'Filter 't1a IN (list#803) : +- 'Filter 't2c IN (list#802) : : +- 'Filter (('t2b > 6) && ('t3b > 't2b)) : : +- 'Aggregate ['t3c, 't3b], ['t3c] : : +- 'UnresolvedRelation `t3` : +- 'Aggregate ['t2c], [unresolvedalias('min('t2a), None)] : +- 'Filter (('t2a = 't2a) && ('t2c >= 1)) : +- 'UnresolvedRelation `t2` +- 'UnresolvedRelation `t1` == Analyzed Logical Plan == t1a: int, t1b: int, t1c: int Project [t1a#764, t1b#765, t1c#766] +- Filter predicate-subquery#803 [(t1a#764 = min(t2a)#816)] : +- Project [min(t2a)#816] : +- !Filter predicate-subquery#802 [(t2c#781 = t3c#796) && (t2b#780 > 6) && (t3b#795 > t2b#780)] : : +- Project [t3c#796, t3b#795] : : +- Aggregate [t3c#796, t3b#795], [t3c#796, t3b#795] : : +- SubqueryAlias t3, `t3` : : +- Project [_1#790 AS t3a#794, _2#791 AS t3b#795, _3#792 AS t3c#796] : : +- LocalRelation [_1#790, _2#791, _3#792] : +- Aggregate [t2c#781], [min(t2a#779) AS min(t2a)#816, t2c#781] : +- Filter ((t2a#779 = t2a#779) && (t2c#781 >= 1)) : +- SubqueryAlias t2, `t2` : +- Project [_1#775 AS t2a#779, _2#776 AS t2b#780, _3#777 AS t2c#781] : +- LocalRelation [_1#775, _2#776, _3#777] +- SubqueryAlias t1, `t1` +- Project [_1#760 AS t1a#764, _2#761 AS t1b#765, _3#762 AS t1c#766] +- LocalRelation [_1#760, _2#761, _3#762] == Optimized Logical Plan == Project [_1#760 AS t1a#764, _2#761 AS t1b#765, _3#762 AS t1c#766] +- Join LeftSemi, (_1#760 = min(t2a)#816) :- LocalRelation [_1#760, _2#761, _3#762] +- Aggregate [t2c#781], [min(t2a#779) AS min(t2a)#816] +- Project [_1#775 AS t2a#779, _3#777 AS t2c#781] +- Join LeftSemi, (((_3#777 = t3c#796) && (_2#776 > 6)) && (t3b#795 > _2#776)) :- Filter (_3#777 >= 1) : +- LocalRelation [_1#775, _2#776, _3#777] +- Aggregate [t3c#796, t3b#795], [t3c#796, t3b#795] +- LocalRelation [t3b#795, t3c#796]
I don't see the column t2b#780 being part of the output of the (lower) Aggregate operator. Somehow the LeftSemi join for t2b > 6 is just moved down below the Aggregate over t2. This does not look right to me.
Attachments
Issue Links
- is related to
-
SPARK-18863 Output non-aggregate expressions without GROUP BY in a subquery does not yield an error
- Resolved