Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Duplicate
-
4.0.0
-
None
Description
For example:
create or replace temp view v1 as select id from range(10); create or replace temp view q1 as select * from v1 where id between 2 and 4; cache table q1; explain select * from q1; == Physical Plan == *(1) Filter ((id#51L >= 2) AND (id#51L <= 4)) +- *(1) Range (0, 10, step=1, splits=8)
Similarly:
create or replace temp view q2 as select count_if(id > 3) as cnt from v1; cache table q2; explain select * from q2; == Physical Plan == AdaptiveSparkPlan isFinalPlan=false +- HashAggregate(keys=[], functions=[count(if (NOT _common_expr_0#88) null else _common_expr_0#88)]) +- Exchange SinglePartition, ENSURE_REQUIREMENTS, [plan_id=182] +- HashAggregate(keys=[], functions=[partial_count(if (NOT _common_expr_0#88) null else _common_expr_0#88)]) +- Project [(id#86L > 3) AS _common_expr_0#88] +- Range (0, 10, step=1, splits=8)
In the output of the above explain commands, neither include an InMemoryRelation node.
The culprit seems to be the common expression ids in the With expressions used in runtime replacements for between and count_if, e.g. this code.
Attachments
Issue Links
- duplicates
-
SPARK-47511 Canonicalize With expressions by re-assigning IDs
- Resolved
- links to