Apache Drill / DRILL-6685

Error in parquet record reader


Details

    Description

      This is the query:
      select VarbinaryValue1 from dfs.`/drill/testdata/batch_memory/fourvarchar_asc_nulls_16MB.parquet` limit 36;

      It appears to be caused by this commit:
      DRILL-6570: Fixed IndexOutofBoundException in Parquet Reader
      aee899c1b26ebb9a5781d280d5a73b42c273d4d5

      This is the stack trace:

      Error: INTERNAL_ERROR ERROR: Error in parquet record reader.
      Message: 
      Hadoop path: /drill/testdata/batch_memory/fourvarchar_asc_nulls_16MB.parquet/0_0_0.parquet
      Total records read: 0
      Row group index: 0
      Records in row group: 1250
      Parquet Metadata: ParquetMetaData{FileMetaData{schema: message root {
        optional int64 Index;
        optional binary VarbinaryValue1;
        optional int64 BigIntValue;
        optional boolean BooleanValue;
        optional int32 DateValue (DATE);
        optional float FloatValue;
        optional binary VarcharValue1 (UTF8);
        optional double DoubleValue;
        optional int32 IntegerValue;
        optional int32 TimeValue (TIME_MILLIS);
        optional int64 TimestampValue (TIMESTAMP_MILLIS);
        optional binary VarbinaryValue2;
        optional fixed_len_byte_array(12) IntervalYearValue (INTERVAL);
        optional fixed_len_byte_array(12) IntervalDayValue (INTERVAL);
        optional fixed_len_byte_array(12) IntervalSecondValue (INTERVAL);
        optional binary VarcharValue2 (UTF8);
      }
      , metadata: {drill-writer.version=2, drill.version=1.14.0-SNAPSHOT}}, blocks: [BlockMetaData{1250, 23750308 [ColumnMetaData{UNCOMPRESSED [Index] optional int64 Index  [PLAIN, RLE, BIT_PACKED], 4}, ColumnMetaData{UNCOMPRESSED [VarbinaryValue1] optional binary VarbinaryValue1  [PLAIN, RLE, BIT_PACKED], 10057}, ColumnMetaData{UNCOMPRESSED [BigIntValue] optional int64 BigIntValue  [PLAIN, RLE, BIT_PACKED], 8174655}, ColumnMetaData{UNCOMPRESSED [BooleanValue] optional boolean BooleanValue  [PLAIN, RLE, BIT_PACKED], 8179722}, ColumnMetaData{UNCOMPRESSED [DateValue] optional int32 DateValue (DATE)  [PLAIN, RLE, BIT_PACKED], 8179916}, ColumnMetaData{UNCOMPRESSED [FloatValue] optional float FloatValue  [PLAIN, RLE, BIT_PACKED], 8184959}, ColumnMetaData{UNCOMPRESSED [VarcharValue1] optional binary VarcharValue1 (UTF8)  [PLAIN, RLE, BIT_PACKED], 8190002}, ColumnMetaData{UNCOMPRESSED [DoubleValue] optional double DoubleValue  [PLAIN, RLE, BIT_PACKED], 10230058}, ColumnMetaData{UNCOMPRESSED [IntegerValue] optional int32 IntegerValue  [PLAIN, RLE, BIT_PACKED], 10240111}, ColumnMetaData{UNCOMPRESSED [TimeValue] optional int32 TimeValue (TIME_MILLIS)  [PLAIN, RLE, BIT_PACKED], 10245154}, ColumnMetaData{UNCOMPRESSED [TimestampValue] optional int64 TimestampValue (TIMESTAMP_MILLIS)  [PLAIN, RLE, BIT_PACKED], 10250197}, ColumnMetaData{UNCOMPRESSED [VarbinaryValue2] optional binary VarbinaryValue2  [PLAIN, RLE, BIT_PACKED], 10260250}, ColumnMetaData{UNCOMPRESSED [IntervalYearValue] optional fixed_len_byte_array(12) IntervalYearValue (INTERVAL)  [PLAIN, RLE, BIT_PACKED], 19632385}, ColumnMetaData{UNCOMPRESSED [IntervalDayValue] optional fixed_len_byte_array(12) IntervalDayValue (INTERVAL)  [PLAIN, RLE, BIT_PACKED], 19647446}, ColumnMetaData{UNCOMPRESSED [IntervalSecondValue] optional fixed_len_byte_array(12) IntervalSecondValue (INTERVAL)  [PLAIN, RLE, BIT_PACKED], 19662507}, ColumnMetaData{UNCOMPRESSED [VarcharValue2] optional binary VarcharValue2 (UTF8)  [PLAIN, RLE, BIT_PACKED], 19677568}]}]}
      
      Fragment 0:0
      
      [Error Id: 25852cdb-3217-4041-9743-66e9f3a2fbe4 on qa-node186.qa.lab:31010] (state=,code=0)
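
      For triage, it can help to estimate how large the queried column chunk is. The sketch below is not part of the original report. It assumes that the number at the end of each ColumnMetaData entry in the message above is the starting offset of that chunk's first data page, and that chunks are laid out back to back in schema order, so the gap between consecutive offsets approximates a chunk's size. The helper names (column_offsets, chunk_sizes) are hypothetical, and only the first three columns are copied from the trace.

```python
import re

# Excerpt of the block metadata from the error message (first three columns only).
excerpt = (
    "ColumnMetaData{UNCOMPRESSED [Index] optional int64 Index  "
    "[PLAIN, RLE, BIT_PACKED], 4}, "
    "ColumnMetaData{UNCOMPRESSED [VarbinaryValue1] optional binary VarbinaryValue1  "
    "[PLAIN, RLE, BIT_PACKED], 10057}, "
    "ColumnMetaData{UNCOMPRESSED [BigIntValue] optional int64 BigIntValue  "
    "[PLAIN, RLE, BIT_PACKED], 8174655}"
)

# "Records in row group" as reported in the error message.
ROWS_IN_ROW_GROUP = 1250


def column_offsets(text):
    """Return (column name, trailing offset) pairs in the order they appear."""
    pattern = re.compile(r"ColumnMetaData\{\w+ \[(\w+)\] .*?\], (\d+)\}")
    return [(name, int(offset)) for name, offset in pattern.findall(text)]


def chunk_sizes(offsets):
    """Approximate each chunk's size as the gap to the next chunk's offset.

    Assumption: chunks are contiguous and stored in the listed order.
    The last listed column has no successor, so it is omitted.
    """
    return {
        name: next_start - start
        for (name, start), (_, next_start) in zip(offsets, offsets[1:])
    }


offsets = column_offsets(excerpt)
sizes = chunk_sizes(offsets)

# Approximate size of the queried column chunk, in bytes.
print(sizes["VarbinaryValue1"])
# Approximate average bytes per row for that column (nulls included in the count).
print(sizes["VarbinaryValue1"] // ROWS_IN_ROW_GROUP)
```

      Under these assumptions the VarbinaryValue1 chunk comes to roughly 8 MB for 1250 rows, about 6.5 KB per row on average, so the failing query reads a wide variable-length column.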
      

      The table can be found at 10.10.100.186:/tmp/fourvarchar_asc_nulls_16MB.parquet

      sys.version is:
      1.15.0-SNAPSHOT a05f17d6fcd80f0d21260d3b1074ab895f457bac Changed PROJECT_OUTPUT_BATCH_SIZE to System + Session 30.07.2018 @ 17:12:53 PDT rhou@mapr.com 30.07.2018 @ 17:25:21 PDT

      fourvarchar_asc_nulls70.q

          People

            sachouche Salim Achouche
            rhou Robert Hou
            Boaz Ben-Zvi Boaz Ben-Zvi
            Votes: 0
            Watchers: 4
