  1. Parquet
  2. PARQUET-18

Cannot read dictionary-encoded pages with all null values

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.6.0
    • Component/s: parquet-mr
    • Labels:
      None

      Description

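      This is issue #283. Parquet-mr tries to read the bit-width byte in DictionaryValuesReader#initFromPage even when the incoming offset is already at the end of the byte array. This happens when the page contains no values because every value is null, and the read then fails with an EOFException.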

      Here's the stack trace:

      Caused by: parquet.io.ParquetDecodingException: could not read page Page [id: 1, bytes.size=7, valueCount=100, uncompressedSize=7] in col [id] INT32
      	at parquet.column.impl.ColumnReaderImpl.readPage(ColumnReaderImpl.java:532)
      	at parquet.column.impl.ColumnReaderImpl.checkRead(ColumnReaderImpl.java:493)
      	at parquet.column.impl.ColumnReaderImpl.consume(ColumnReaderImpl.java:546)
      	at parquet.column.impl.ColumnReaderImpl.<init>(ColumnReaderImpl.java:339)
      	at parquet.column.impl.ColumnReadStoreImpl.newMemColumnReader(ColumnReadStoreImpl.java:63)
      	at parquet.column.impl.ColumnReadStoreImpl.getColumnReader(ColumnReadStoreImpl.java:58)
      	at parquet.io.RecordReaderImplementation.<init>(RecordReaderImplementation.java:265)
      	at parquet.io.MessageColumnIO.getRecordReader(MessageColumnIO.java:60)
      	at parquet.io.MessageColumnIO.getRecordReader(MessageColumnIO.java:74)
      	at parquet.hadoop.InternalParquetRecordReader.checkRead(InternalParquetRecordReader.java:112)
      	at parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:174)
      	... 29 more
      Caused by: java.io.EOFException
      	at parquet.bytes.BytesUtils.readIntLittleEndianOnOneByte(BytesUtils.java:76)
      	at parquet.column.values.dictionary.DictionaryValuesReader.initFromPage(DictionaryValuesReader.java:55)
      	at parquet.column.impl.ColumnReaderImpl.readPage(ColumnReaderImpl.java:530)
      	... 39 more
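
The failure mode can be sketched in isolation. The class below is a hypothetical stand-in, not parquet-mr code: `readOneByte` only imitates the one-byte read that the trace shows throwing, and returning -1 from the guarded variant is an illustrative choice, not necessarily how the actual fix behaves. It assumes the 7 page bytes are entirely level data, so the offset handed to the values reader equals the array length.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-in for the page initialization step; not parquet-mr code.
class DictionaryPageInitSketch {

  // Reads a single byte as an unsigned int and throws EOFException at end of stream.
  static int readOneByte(InputStream in) throws IOException {
    int b = in.read();
    if (b < 0) {
      throw new EOFException();
    }
    return b;
  }

  // Unguarded: always tries to read the bit-width byte, even if nothing is left.
  static int bitWidthUnguarded(byte[] page, int offset) throws IOException {
    InputStream in = new ByteArrayInputStream(page, offset, page.length - offset);
    return readOneByte(in);
  }

  // Guarded: checks that at least one byte remains before reading.
  // Returns -1 (an assumption made for this sketch) when the page holds no values.
  static int bitWidthGuarded(byte[] page, int offset) throws IOException {
    if (offset >= page.length) {
      return -1;
    }
    return bitWidthUnguarded(page, offset);
  }

  public static void main(String[] args) throws IOException {
    byte[] page = new byte[7]; // same size as the page in the trace
    int offset = page.length;  // assumed: levels consumed every byte

    System.out.println("guarded: " + bitWidthGuarded(page, offset));
    try {
      bitWidthUnguarded(page, offset);
    } catch (EOFException e) {
      System.out.println("unguarded: EOFException");
    }
  }
}
```

With a guard of this kind, an all-null page can be initialized without touching the byte array, and an error is only needed if a caller later asks for a value that does not exist.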
      

            People

            • Assignee: Ryan Blue (rdblue)
            • Reporter: Ryan Blue (rdblue)
