  1. Apache Drill
  2. DRILL-3966

Metadata Cache + Partition Pruning not happening when the partition column is of type boolean

Details

    Description

      git.commit.id.abbrev=19b4b79

      I have partitioned parquet files whose partition column is of type boolean.
      The plan below shows that pruning does not take place when the partition column is of type boolean and a metadata cache file exists: the scan still reads both files (numFiles=2, usedMetadataFile=true). However, if I remove the metadata cache, partition pruning works as expected.

      Query :

      explain plan for select * from fewtypes_boolpartition where bool_col = false;
      
      00-00    Screen
      00-01      Project(*=[$0])
      00-02        Project(T11¦¦*=[$0])
      00-03          SelectionVectorRemover
      00-04            Filter(condition=[=($1, false)])
      00-05              Project(T11¦¦*=[$0], bool_col=[$1])
      00-06                Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/metadata_caching/fewtypes_boolpartition/0_0_2.parquet], ReadEntryWithPath [path=maprfs:///drill/testdata/metadata_caching/fewtypes_boolpartition/0_0_1.parquet]], selectionRoot=/drill/testdata/metadata_caching/fewtypes_boolpartition, numFiles=2, usedMetadataFile=true, columns=[`*`]]])
      
      

      Error from the log :

      WARN  o.a.d.e.p.l.partition.PruneScanRule - Exception while trying to prune partition.
       java.lang.UnsupportedOperationException: Unsupported type: BIT
       	at org.apache.drill.exec.store.parquet.ParquetGroupScan.populatePruningVector(ParquetGroupScan.java:451) ~[drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
       	at org.apache.drill.exec.planner.ParquetPartitionDescriptor.populatePartitionVectors(ParquetPartitionDescriptor.java:96) ~[drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
       	at org.apache.drill.exec.planner.logical.partition.PruneScanRule.doOnMatch(PruneScanRule.java:212) ~[drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
       	at org.apache.drill.exec.planner.logical.partition.ParquetPruneScanRule$2.onMatch(ParquetPruneScanRule.java:87) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
       	at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228) [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
       	at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:808) [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
       	at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:303) [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
       	at org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:303) [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
       	at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.logicalPlanningVolcanoAndLopt(DefaultSqlHandler.java:545) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
       	at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:213) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
       	at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:248) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
       	at org.apache.drill.exec.planner.sql.handlers.ExplainHandler.getPlan(ExplainHandler.java:61) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
       	at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:178) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
       	at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:905) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
       	at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:244) [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
       	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_71]
       	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_71]
       	at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
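
The trace suggests that the pruning code dispatches on the partition column's type and has no branch for BIT, and that the resulting exception is caught and logged as a WARN while the plan falls back to scanning every file. Below is a minimal, self-contained sketch of that failure pattern. It is not Drill's actual code: `PruningSketch`, `ColumnType`, `FileEntry`, `toComparable`, and `prune` are hypothetical names, and the boolean value assigned to each file is assumed for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PruningSketch {

    // Hypothetical stand-in for the types a partition column may have.
    enum ColumnType { INT, BIGINT, VARCHAR, BIT }

    // Hypothetical stand-in for one file entry read from a metadata cache.
    static final class FileEntry {
        final String path;
        final ColumnType type;
        final Object partitionValue;

        FileEntry(String path, ColumnType type, Object partitionValue) {
            this.path = path;
            this.type = type;
            this.partitionValue = partitionValue;
        }
    }

    // Type dispatch with no BIT branch: boolean columns hit the default case.
    static Object toComparable(ColumnType type, Object value) {
        switch (type) {
            case INT:
                return (Integer) value;
            case BIGINT:
                return (Long) value;
            case VARCHAR:
                return (String) value;
            default:
                throw new UnsupportedOperationException("Unsupported type: " + type);
        }
    }

    // Keeps only files whose partition value equals `wanted`.
    // On an unsupported type it logs a warning and returns every file unpruned.
    static List<String> prune(List<FileEntry> files, Object wanted) {
        try {
            List<String> kept = new ArrayList<>();
            for (FileEntry f : files) {
                if (toComparable(f.type, f.partitionValue).equals(wanted)) {
                    kept.add(f.path);
                }
            }
            return kept;
        } catch (UnsupportedOperationException e) {
            System.err.println("WARN Exception while trying to prune partition. " + e);
            List<String> all = new ArrayList<>();
            for (FileEntry f : files) {
                all.add(f.path);
            }
            return all;
        }
    }

    public static void main(String[] args) {
        // Assumed values: which file holds true and which holds false is not stated in the report.
        List<FileEntry> files = Arrays.asList(
            new FileEntry("0_0_1.parquet", ColumnType.BIT, Boolean.TRUE),
            new FileEntry("0_0_2.parquet", ColumnType.BIT, Boolean.FALSE));
        System.out.println(prune(files, Boolean.FALSE));
    }
}
```

In this sketch the filter on a BIT column returns both paths, which mirrors the numFiles=2 seen in the plan. Adding a `case BIT:` branch that returns the `Boolean` value would let the same call keep only the matching file.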
      

      I have attached the required data sets. Let me know if you need anything else.

      Attachments

        1. 0_0_1.parquet
          2 kB
          Rahul Kumar Challapalli
        2. 0_0_2.parquet
          2 kB
          Rahul Kumar Challapalli

            People

              Assignee: Unassigned
              Reporter: Rahul Kumar Challapalli (rkins)