Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Duplicate
Description
git.commit.id.abbrev=19b4b79
I have partitioned Parquet files whose partition column is of type boolean.
The plan below suggests that pruning does not take place when the partition column is of type boolean and a metadata cache file exists. However, if I remove the metadata cache, partition pruning works fine.
Query :
explain plan for select * from fewtypes_boolpartition where bool_col = false;
00-00 Screen
00-01 Project(*=[$0])
00-02 Project(T11¦¦*=[$0])
00-03 SelectionVectorRemover
00-04 Filter(condition=[=($1, false)])
00-05 Project(T11¦¦*=[$0], bool_col=[$1])
00-06 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/metadata_caching/fewtypes_boolpartition/0_0_2.parquet], ReadEntryWithPath [path=maprfs:///drill/testdata/metadata_caching/fewtypes_boolpartition/0_0_1.parquet]], selectionRoot=/drill/testdata/metadata_caching/fewtypes_boolpartition, numFiles=2, usedMetadataFile=true, columns=[`*`]]])
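One way to compare plans with and without the metadata cache is to read the `numFiles` and `usedMetadataFile` fields from the Scan line of the plan text. The following is only a sketch under that assumption: the class `PlanCheck` and its methods are hypothetical helpers, not part of Drill, and the field names are taken from the plan above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Drill: extracts scan details from plan text.
class PlanCheck {
    private static final Pattern NUM_FILES = Pattern.compile("numFiles=(\\d+)");
    private static final Pattern USED_METADATA =
            Pattern.compile("usedMetadataFile=(true|false)");

    // Returns the number of files the scan reads, as reported in the plan text.
    static int numFiles(String plan) {
        Matcher m = NUM_FILES.matcher(plan);
        if (!m.find()) {
            throw new IllegalArgumentException("no numFiles field in plan");
        }
        return Integer.parseInt(m.group(1));
    }

    // Returns whether the plan reports that the metadata cache file was used.
    static boolean usedMetadataFile(String plan) {
        Matcher m = USED_METADATA.matcher(plan);
        if (!m.find()) {
            throw new IllegalArgumentException("no usedMetadataFile field in plan");
        }
        return Boolean.parseBoolean(m.group(1));
    }
}
```

For the plan above, such a check would report two files with the metadata file in use, which matches the observation that both files are still scanned.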
Error from the log :
WARN o.a.d.e.p.l.partition.PruneScanRule - Exception while trying to prune partition.
java.lang.UnsupportedOperationException: Unsupported type: BIT
    at org.apache.drill.exec.store.parquet.ParquetGroupScan.populatePruningVector(ParquetGroupScan.java:451) ~[drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
    at org.apache.drill.exec.planner.ParquetPartitionDescriptor.populatePartitionVectors(ParquetPartitionDescriptor.java:96) ~[drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
    at org.apache.drill.exec.planner.logical.partition.PruneScanRule.doOnMatch(PruneScanRule.java:212) ~[drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
    at org.apache.drill.exec.planner.logical.partition.ParquetPruneScanRule$2.onMatch(ParquetPruneScanRule.java:87) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
    at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228) [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
    at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:808) [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
    at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:303) [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
    at org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:303) [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
    at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.logicalPlanningVolcanoAndLopt(DefaultSqlHandler.java:545) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
    at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:213) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
    at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:248) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
    at org.apache.drill.exec.planner.sql.handlers.ExplainHandler.getPlan(ExplainHandler.java:61) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
    at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:178) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
    at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:905) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
    at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:244) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_71]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_71]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
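The trace is consistent with a failure pattern along these lines: a per-type switch that fills the pruning vector has no branch for BIT and throws, and the pruning rule catches the exception, logs a warning, and leaves the scan unpruned. The sketch below is a simplified, self-contained model of that pattern, not Drill's actual code; `PruningSketch`, `MinorType`, `populate`, and `tryPrune` are hypothetical names, and the exact shape of the real switch is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the failure mode; not Drill's real classes.
class PruningSketch {
    enum MinorType { INT, VARCHAR, BIT }

    // Converts a raw partition value; there is no BIT branch (assumed shape of the bug).
    static Object populate(MinorType type, String rawValue) {
        switch (type) {
            case INT:
                return Integer.valueOf(rawValue);
            case VARCHAR:
                return rawValue;
            default:
                throw new UnsupportedOperationException("Unsupported type: " + type);
        }
    }

    // Keeps only files whose partition value matches; on failure, keeps all files.
    static List<String> tryPrune(List<String> files, List<String> partitionValues,
                                 MinorType type, String wanted) {
        try {
            List<String> kept = new ArrayList<>();
            for (int i = 0; i < files.size(); i++) {
                Object value = populate(type, partitionValues.get(i));
                if (value.toString().equals(wanted)) {
                    kept.add(files.get(i));
                }
            }
            return kept;
        } catch (UnsupportedOperationException e) {
            // The exception is swallowed and only logged, so the query still runs unpruned.
            System.err.println("WARN Exception while trying to prune partition. " + e);
            return files;
        }
    }
}
```

In this model, a BIT partition column returns the full file list, while a supported type returns only the matching files. That mirrors the symptom reported here: the query still plans successfully, but the scan covers every file.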
I have attached the required data sets. Let me know if you need anything else.
Attachments
Issue Links
- duplicates DRILL-4139 Fix parquet partition pruning for BIT, INTERVAL and DECIMAL types (Resolved)