Uploaded image for project: 'Apache Drill'
  1. Apache Drill
  2. DRILL-3975

Partition Planning rule causes query failure due to IndexOutOfBoundsException on HDFS

Attach filesAttach ScreenshotVotersWatch issueWatchersCreate sub-taskLinkCloneUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    Description

      In attempting to run the extended test suite provided by MapR, there are a large number of queries that fail due to issues in the PruneScanRule and specifically the DFSPartitionLocation constructor line 31. It is likely due to issues with the code that are related to running on HDFS where this code path has apparently not been tested.

      An example test query this type of failure occurred:

      /src/drill-test-framework/resources/Functional/ctas/ctas_auto_partition/tpch0.01_multiple_partitions/data/q11.q

      Example stack trace below:

      org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: StringIndexOutOfBoundsException: String index out of range: -12
      
      
      [Error Id: f2941267-49b1-4f67-a17f-610ffb13fcb7 on ip-172-31-30-32.us-west-2.compute.internal:31010]
              at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:534) ~[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
              at org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:742) [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
              at org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:841) [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
              at org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:786) [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
              at org.apache.drill.common.EventProcessor.sendEvent(EventProcessor.java:73) [drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
              at org.apache.drill.exec.work.foreman.Foreman$StateSwitch.moveToState(Foreman.java:788) [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
              at org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:894) [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
              at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:255) [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_85]
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_85]
              at java.lang.Thread.run(Thread.java:745) [na:1.7.0_85]
      Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception during fragment initialization: Internal error: Error while applying rule PruneScanRule:Filter_On_Scan_Parquet, args [rel#43148:DrillFilterRel.LOGICAL.ANY([]).[](input=rel#43147:Subset#4.LOGICAL.ANY([]).[],condition==($0, 1)), rel#43241:DrillScanRel.LOGICAL.ANY([]).[](table=[dfs, ctasAutoPartition, tpch_multiple_partitions/lineitem_twopart_ordered2],groupscan=ParquetGroupScan [entries=[ReadEntryWithPath [path=hdfs://ip-172-31-30-32:54310/drill/testdata/ctas_auto_partition/tpch_multiple_partitions/lineitem_twopart_ordered2]], selectionRoot=hdfs://ip-172-31-30-32:54310/drill/testdata/ctas_auto_partition/tpch_multiple_partitions/lineitem_twopart_ordered2, numFiles=1, usedMetadataFile=false, columns=[`l_modline`, `l_moddate`]])]
              ... 4 common frames omitted
      Caused by: java.lang.AssertionError: Internal error: Error while applying rule PruneScanRule:Filter_On_Scan_Parquet, args [rel#43148:DrillFilterRel.LOGICAL.ANY([]).[](input=rel#43147:Subset#4.LOGICAL.ANY([]).[],condition==($0, 1)), rel#43241:DrillScanRel.LOGICAL.ANY([]).[](table=[dfs, ctasAutoPartition, tpch_multiple_partitions/lineitem_twopart_ordered2],groupscan=ParquetGroupScan [entries=[ReadEntryWithPath [path=hdfs://ip-172-31-30-32:54310/drill/testdata/ctas_auto_partition/tpch_multiple_partitions/lineitem_twopart_ordered2]], selectionRoot=hdfs://ip-172-31-30-32:54310/drill/testdata/ctas_auto_partition/tpch_multiple_partitions/lineitem_twopart_ordered2, numFiles=1, usedMetadataFile=false, columns=[`l_modline`, `l_moddate`]])]
              at org.apache.calcite.util.Util.newInternal(Util.java:792) ~[calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
              at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:251) ~[calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
              at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:808) ~[calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
              at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:303) ~[calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
              at org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:303) ~[calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
              at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.logicalPlanningVolcanoAndLopt(DefaultSqlHandler.java:545) ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
              at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:213) ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
              at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:248) ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
              at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:164) ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
              at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:178) ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
              at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:905) [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
              at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:244) [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
              ... 3 common frames omitted
      Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: -12
              at java.lang.String.substring(String.java:1875) ~[na:1.7.0_85]
              at org.apache.drill.exec.planner.DFSPartitionLocation.<init>(DFSPartitionLocation.java:31) ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
              at org.apache.drill.exec.planner.ParquetPartitionDescriptor.createPartitionSublists(ParquetPartitionDescriptor.java:126) ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
              at org.apache.drill.exec.planner.AbstractPartitionDescriptor.iterator(AbstractPartitionDescriptor.java:53) ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
              at org.apache.drill.exec.planner.logical.partition.PruneScanRule.doOnMatch(PruneScanRule.java:190) ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
              at org.apache.drill.exec.planner.logical.partition.ParquetPruneScanRule$2.onMatch(ParquetPruneScanRule.java:87) ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
              at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228) ~[calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6]
              ... 13 common frames omitted
      

      Attachments

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            sphillips Steven Phillips
            jnadeau Jacques Nadeau
            Rahul Kumar Challapalli Rahul Kumar Challapalli
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Issue deployment