Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Duplicate
- Affects Version/s: 2.3.0
- Fix Version/s: None
- None
Description
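Scalar subqueries in a predicate on a partition column are not pushed down as partition filters. See the following example, where the literal comparison appears under PartitionFilters but the equivalent subquery comparison does not: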
create table user_mayerjoh.tab (c1 string) partitioned by (c2 string) stored as parquet;
explain select * from user_mayerjoh.tab where c2 < 1;
== Physical Plan ==
*(1) FileScan parquet user_mayerjoh.tab[c1#22,c2#23] Batched: true, Format: Parquet, Location: PrunedInMemoryFileIndex[], PartitionCount: 0, PartitionFilters: [isnotnull(c2#23), (cast(c2#23 as int) < 1)], PushedFilters: [], ReadSchema: struct<c1:string>
explain select * from user_mayerjoh.tab where c2 < (select 1);
== Physical Plan ==
+- *(1) FileScan parquet user_mayerjoh.tab[c1#30,c2#31] Batched: true, Format: Parquet, Location: PrunedInMemoryFileIndex[], PartitionCount: 0, PartitionFilters: [isnotnull(c2#31)], PushedFilters: [], ReadSchema: struct<c1:string>
Is it possible to execute the subquery first and use its result as a partition filter?
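Until the planner does this itself, one possible client-side workaround is to run the subquery eagerly and substitute its result as a literal. The sketch below is only an illustration under assumptions: `spark` is taken to be a `pyspark.sql.SparkSession`, the helper name `select_with_precomputed_bound` is hypothetical and not part of Spark, and only numeric scalars with the `<` comparison from the example are handled.

```python
def select_with_precomputed_bound(spark, table, column, subquery):
    """Run `subquery` first, then filter `table` on `column < <result>`.

    Assumption: `spark` is a pyspark.sql.SparkSession (or any object with a
    compatible .sql() method). This helper is illustrative, not a Spark API.
    """
    # .first() triggers execution of the scalar subquery on its own.
    row = spark.sql(subquery).first()
    if row is None:
        raise ValueError("subquery returned no rows")
    bound = row[0]
    # Keep the sketch narrow: only plain numeric scalars are substituted,
    # which avoids any string-quoting or escaping concerns.
    if isinstance(bound, bool) or not isinstance(bound, (int, float)):
        raise TypeError("this sketch only handles numeric scalars")
    # The bound is now a literal in the query text, i.e. the same shape as
    # `where c2 < 1` in the first example of the description.
    return spark.sql(f"select * from {table} where {column} < {bound}")


# Usage, assuming an active SparkSession named `spark`:
# df = select_with_precomputed_bound(spark, "user_mayerjoh.tab", "c2", "select 1")
# df.explain()
```

Because the second statement contains a plain literal, it has the same form as the first `explain` above, where the comparison was listed under PartitionFilters. The cost is an extra round trip for the subquery, and the two statements are not evaluated atomically.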
Issue Links
- duplicates SPARK-26893 Allow partition pruning with subquery filters on file source (Resolved)