Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
3.4.2, 4.0.0, 3.5.1
Description
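A cached view defined with a lateral join is not read from the cache when the join condition is in the ON clause. For example: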
CREATE or REPLACE TEMP VIEW t1(c1, c2) AS VALUES (0, 1), (1, 2);
CREATE or REPLACE TEMP VIEW t2(c1, c2) AS VALUES (0, 1), (1, 2);

create or replace temp view v1 as
select * from t1
join lateral (
  select c1 as a, c2 as b
  from t2)
on c1 = a;

cache table v1;

explain select * from v1;
== Physical Plan ==
AdaptiveSparkPlan isFinalPlan=false
+- BroadcastHashJoin [c1#180], [a#173], Inner, BuildRight, false
   :- LocalTableScan [c1#180, c2#181]
   +- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [plan_id=113]
      +- LocalTableScan [a#173, b#174]
Note that the plan contains no InMemoryRelation, so the cached data for v1 is not used.
However, if you move the join condition into the subquery, the cached plan is used:
CREATE or REPLACE TEMP VIEW t1(c1, c2) AS VALUES (0, 1), (1, 2);
CREATE or REPLACE TEMP VIEW t2(c1, c2) AS VALUES (0, 1), (1, 2);

create or replace temp view v2 as
select * from t1
join lateral (
  select c1 as a, c2 as b
  from t2
  where t1.c1 = t2.c1);

cache table v2;

explain select * from v2;
== Physical Plan ==
AdaptiveSparkPlan isFinalPlan=false
+- Scan In-memory table v2 [c1#176, c2#177, a#178, b#179]
      +- InMemoryRelation [c1#176, c2#177, a#178, b#179], StorageLevel(disk, memory, deserialized, 1 replicas)
            +- AdaptiveSparkPlan isFinalPlan=true
               +- == Final Plan ==
                  *(1) Project [c1#26, c2#27, a#19, b#20]
                  +- *(1) BroadcastHashJoin [c1#26], [c1#30], Inner, BuildLeft, false
                     :- BroadcastQueryStage 0
                     :  +- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [plan_id=37]
                     :     +- LocalTableScan [c1#26, c2#27]
                     +- *(1) LocalTableScan [a#19, b#20, c1#30]
               +- == Initial Plan ==
                  Project [c1#26, c2#27, a#19, b#20]
                  +- BroadcastHashJoin [c1#26], [c1#30], Inner, BuildLeft, false
                     :- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [plan_id=37]
                     :  +- LocalTableScan [c1#26, c2#27]
                     +- LocalTableScan [a#19, b#20, c1#30]
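The difference between the two plans above could be checked mechanically, for instance in a regression test. The following is only a rough sketch: the helper name `uses_cached_plan` and the constants `V1_PLAN` and `V2_PLAN` are hypothetical, the plan strings are shortened excerpts of the output shown in this report, and the check assumes that the presence of the text `InMemoryRelation` in the explain output is a sufficient signal that the cache is used.

```python
# Shortened excerpts of the two physical plans from this report.
V1_PLAN = """== Physical Plan ==
AdaptiveSparkPlan isFinalPlan=false
+- BroadcastHashJoin [c1#180], [a#173], Inner, BuildRight, false
   :- LocalTableScan [c1#180, c2#181]
   +- BroadcastExchange HashedRelationBroadcastMode(...), [plan_id=113]
      +- LocalTableScan [a#173, b#174]
"""

V2_PLAN = """== Physical Plan ==
AdaptiveSparkPlan isFinalPlan=false
+- Scan In-memory table v2 [c1#176, c2#177, a#178, b#179]
      +- InMemoryRelation [c1#176, c2#177, a#178, b#179], StorageLevel(disk, memory, deserialized, 1 replicas)
"""


def uses_cached_plan(explain_text: str) -> bool:
    """Hypothetical helper: True if the plan text mentions InMemoryRelation.

    Assumption: a plan that reads from the cache contains an
    InMemoryRelation node, as in the v2 output above.
    """
    return "InMemoryRelation" in explain_text


if __name__ == "__main__":
    # v1 (condition in the ON clause) does not read from the cache.
    print("v1 uses cache:", uses_cached_plan(V1_PLAN))
    # v2 (condition inside the lateral subquery) does.
    print("v2 uses cache:", uses_cached_plan(V2_PLAN))
```

In a live PySpark session the plan text could presumably be obtained with something like `spark.sql("explain select * from v1").collect()[0][0]` and passed to the same helper.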