Apache Drill / DRILL-4770

ParquetRecordReader throws NPE querying a single int64 column file


    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.8.0
    • Fix Version/s: None
    • Component/s: Storage - Parquet
    • Labels: None

      Description

      I have a Parquet file with a single required int64 column (10 values, uncompressed, DELTA_BINARY_PACKED value encoding):

      [root@perfnode166 parquet-mr]# java -jar parquet-tools/target/parquet-tools-1.8.2-SNAPSHOT.jar dump /mapr/drill50.perf.lab/drill/testdata/parquet_storage/int64_10_bs10k_ps1k_uncompressed.parquet
      row group 0
      --------------------------------------------------------------------------------
      int64_field_required:  INT64 UNCOMPRESSED DO:0 FPO:4 SZ:55/55/1.00 VC:10 [more]...
      
          int64_field_required TV=10 RL=0 DL=0
          ----------------------------------------------------------------------------
          page 0:  DLE:RLE RLE:RLE VLE:DELTA_BINARY_PACKED ST:[min: 0, max:  [more]... VC:10
      
      INT64 int64_field_required
      --------------------------------------------------------------------------------
      *** row group 1 of 1, values 1 to 10 ***
      value 1:  R:0 D:0 V:0
      value 2:  R:0 D:0 V:1
      value 3:  R:0 D:0 V:2
      value 4:  R:0 D:0 V:3
      value 5:  R:0 D:0 V:4
      value 6:  R:0 D:0 V:5
      value 7:  R:0 D:0 V:6
      value 8:  R:0 D:0 V:7
      value 9:  R:0 D:0 V:8
      value 10: R:0 D:0 V:9
      

      Drill version:

      0: jdbc:drill:schema=dfs.drillTestDir> select * from sys.version;
      +-----------------+-------------------------------------------+-----------------------------------------------------------------------------------------------------------------+----------------------------+---------------------+----------------------------+
      |     version     |                 commit_id                 |                                                 commit_message                                                  |        commit_time         |     build_email     |         build_time         |
      +-----------------+-------------------------------------------+-----------------------------------------------------------------------------------------------------------------+----------------------------+---------------------+----------------------------+
      | 1.8.0-SNAPSHOT  | 05c42eae79ce3e309028b3824f9449b98e329f29  | DRILL-4707: Fix memory leak or incorrect query result in case two column names are case-insensitive identical.  | 29.06.2016 @ 08:15:13 PDT  | inramana@gmail.com  | 07.07.2016 @ 10:50:40 PDT  |
      +-----------------+-------------------------------------------+-----------------------------------------------------------------------------------------------------------------+----------------------------+---------------------+----------------------------+
      1 row selected (0.44 seconds)
      

      Drill throws an NPE when querying the file:

      2016-07-08 11:08:55,156 [288013c7-f122-f6be-936e-c18ebe9b92ef:foreman] INFO  o.a.drill.exec.work.foreman.Foreman - Query text for query id 288013c7-f122-f6be-936e-c18ebe9b92ef: select * from dfs.`drill/testdata/parquet_storage/int64_10_bs10k_ps1k_uncompressed.parquet`
      2016-07-08 11:08:55,292 [288013c7-f122-f6be-936e-c18ebe9b92ef:foreman] INFO  o.a.d.exec.store.parquet.Metadata - Took 0 ms to get file statuses
      2016-07-08 11:08:55,295 [288013c7-f122-f6be-936e-c18ebe9b92ef:foreman] INFO  o.a.d.exec.store.parquet.Metadata - Fetch parquet metadata: Executed 1 out of 1 using 1 threads. Time: 2ms total, 2.423069ms avg, 2ms max.
      2016-07-08 11:08:55,295 [288013c7-f122-f6be-936e-c18ebe9b92ef:foreman] INFO  o.a.d.exec.store.parquet.Metadata - Fetch parquet metadata: Executed 1 out of 1 using 1 threads. Earliest start: 1.347000 μs, Latest start: 1.347000 μs, Average start: 1.347000 μs .
      2016-07-08 11:08:55,295 [288013c7-f122-f6be-936e-c18ebe9b92ef:foreman] INFO  o.a.d.exec.store.parquet.Metadata - Took 2 ms to read file metadata
      2016-07-08 11:08:55,377 [288013c7-f122-f6be-936e-c18ebe9b92ef:frag:0:0] INFO  o.a.d.e.w.fragment.FragmentExecutor - 288013c7-f122-f6be-936e-c18ebe9b92ef:0:0: State change requested AWAITING_ALLOCATION --> RUNNING
      2016-07-08 11:08:55,377 [288013c7-f122-f6be-936e-c18ebe9b92ef:frag:0:0] INFO  o.a.d.e.w.f.FragmentStatusReporter - 288013c7-f122-f6be-936e-c18ebe9b92ef:0:0: State to report: RUNNING
      2016-07-08 11:08:55,386 [288013c7-f122-f6be-936e-c18ebe9b92ef:frag:0:0] INFO  o.a.d.e.w.fragment.FragmentExecutor - 288013c7-f122-f6be-936e-c18ebe9b92ef:0:0: State change requested RUNNING --> FAILED
      2016-07-08 11:08:55,386 [288013c7-f122-f6be-936e-c18ebe9b92ef:frag:0:0] INFO  o.a.d.e.w.fragment.FragmentExecutor - 288013c7-f122-f6be-936e-c18ebe9b92ef:0:0: State change requested FAILED --> FINISHED
      2016-07-08 11:08:55,387 [288013c7-f122-f6be-936e-c18ebe9b92ef:frag:0:0] ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: NullPointerException
      
      Fragment 0:0
      
      [Error Id: 21fcc35b-6151-46b6-a750-0ce6f2141a7d on 10.10.30.167:31010]
      org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: NullPointerException
      
      Fragment 0:0
      
      [Error Id: 21fcc35b-6151-46b6-a750-0ce6f2141a7d on 10.10.30.167:31010]
      	at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:543) ~[drill-common-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
      	at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:318) [drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
      	at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:185) [drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
      	at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:287) [drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
      	at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_45]
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_45]
      	at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
      Caused by: org.apache.drill.common.exceptions.DrillRuntimeException: Error in parquet record reader.
      Message:
      Hadoop path: /drill/testdata/parquet_storage/int64_10_bs10k_ps1k_uncompressed.parquet
      Total records read: 0
      Mock records read: 0
      Records to read: 10
      Row group index: 0
      Records in row group: 10
      Parquet Metadata: ParquetMetaData{FileMetaData{schema: message test {
        required int64 int64_field_required;
      }
      , metadata: {writer.model.name=example}}, blocks: [BlockMetaData{10, 55 [ColumnMetaData{UNCOMPRESSED [int64_field_required] INT64  [DELTA_BINARY_PACKED], 4}]}]}
      	at org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.handleAndRaise(ParquetRecordReader.java:352) ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
      	at org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.next(ParquetRecordReader.java:454) ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
      	at org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:178) ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
      	at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
      	at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
      	at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
      	at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:135) ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
      	at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
      	at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:104) ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
      	at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:81) ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
      	at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:94) ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
      	at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:257) ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
      	at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:251) ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
      	at java.security.AccessController.doPrivileged(Native Method) ~[na:1.7.0_45]
      	at javax.security.auth.Subject.doAs(Subject.java:415) ~[na:1.7.0_45]
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1595) ~[hadoop-common-2.7.0-mapr-1602.jar:na]
      	at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:251) [drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
      	... 4 common frames omitted
      Caused by: java.lang.NullPointerException: null
      	at org.apache.drill.exec.store.parquet.columnreaders.PageReader.next(PageReader.java:241) ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
      	at org.apache.drill.exec.store.parquet.columnreaders.ColumnReader.readPage(ColumnReader.java:198) ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
      	at org.apache.drill.exec.store.parquet.columnreaders.ColumnReader.determineSize(ColumnReader.java:141) ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
      	at org.apache.drill.exec.store.parquet.columnreaders.ColumnReader.processPages(ColumnReader.java:107) ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
      	at org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.readAllFixedFields(ParquetRecordReader.java:393) ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
      	at org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.next(ParquetRecordReader.java:436) ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
      	... 19 common frames omitted
      2016-07-08 11:08:55,412 [CONTROL-rpc-event-queue] WARN  o.a.drill.exec.work.foreman.Foreman - Dropping request to move to COMPLETED state as query is already at FAILED state (which is terminal).
      2016-07-08 11:08:55,413 [CONTROL-rpc-event-queue] WARN  o.a.d.e.w.b.ControlMessageHandler - Dropping request to cancel fragment. 288013c7-f122-f6be-936e-c18ebe9b92ef:0:0 does not exist.
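
      For triage, the innermost "Caused by" entry and its first stack frame show where the failure originated. The following is a minimal sketch of extracting them from a log like the one above, assuming the log text is available as a string. The function name root_cause and the regular expressions are illustrative assumptions and are not part of Drill.

```python
import re

# Matches lines such as "Caused by: java.lang.NullPointerException: null".
CAUSED_BY = re.compile(r"^\s*Caused by: (?P<exc>[\w.$]+)(?::\s*(?P<msg>.*))?$")
# Matches stack frames such as "at pkg.Class.method(File.java:123) ~[jar]".
FRAME = re.compile(r"^\s*at (?P<method>[\w.$<>]+)\((?P<loc>[^)]*)\)")


def root_cause(log_text):
    """Return (exception, (method, location)) for the last 'Caused by'
    entry in log_text, or None if the log has no 'Caused by' line.

    The frame element is None when no stack frame follows the entry.
    """
    exc = None
    frame = None
    for line in log_text.splitlines():
        cause = CAUSED_BY.match(line)
        if cause:
            # A deeper cause replaces the previous one; reset its frame.
            exc = cause.group("exc")
            frame = None
            continue
        if exc is not None and frame is None:
            hit = FRAME.match(line)
            if hit:
                # Only the first frame after the cause is recorded.
                frame = (hit.group("method"), hit.group("loc"))
    if exc is None:
        return None
    return (exc, frame)
```

      Applied to the log in this report, the sketch would point at PageReader.next (PageReader.java:241) as the origin of the NullPointerException.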
      

            People

            • Assignee: Padma Penumarthy (ppenumarthy)
            • Reporter: Chun Chang (cchang@maprtech.com)