  1. Parquet
  2. PARQUET-246

ArrayIndexOutOfBoundsException with Parquet write version v2

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.6.0
    • Fix Version/s: 1.8.0
    • Component/s: None
    • Labels: None

    Description

      I get the following exception when reading a Parquet file that was created using Avro WriteSupport and Parquet writer version v2.0:

      Caused by: parquet.io.ParquetDecodingException: Can't read value in column [colName, rows, array, name] BINARY at value 313601 out of 428260, 1 out of 39200 in currentPage. repetition level: 0, definition level: 2
      	at parquet.column.impl.ColumnReaderImpl.readValue(ColumnReaderImpl.java:462)
      	at parquet.column.impl.ColumnReaderImpl.writeCurrentValueToConverter(ColumnReaderImpl.java:364)
      	at parquet.io.RecordReaderImplementation.read(RecordReaderImplementation.java:405)
      	at parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:209)
      	... 27 more
      Caused by: java.lang.ArrayIndexOutOfBoundsException
      	at parquet.column.values.deltastrings.DeltaByteArrayReader.readBytes(DeltaByteArrayReader.java:70)
      	at parquet.column.impl.ColumnReaderImpl$2$6.read(ColumnReaderImpl.java:307)
      	at parquet.column.impl.ColumnReaderImpl.readValue(ColumnReaderImpl.java:458)
      	... 30 more
      
      

      The file is quite big (500 MB), so I cannot upload it here, but there may be enough information in the exception message to identify the cause of the error.
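
      The innermost frame in the trace is DeltaByteArrayReader.readBytes, and the message places the failure at value 1 of 39200 in the current page. Parquet's delta byte array encoding stores each value as a prefix length shared with the previous value plus a suffix. One possible reading, which is an assumption and is not confirmed anywhere in this report, is that the first value of a page carries a non-zero prefix length while the reader has no previous value to copy from. The sketch below only illustrates that failure mode; DeltaByteArraySketch, Entry, and decodePage are hypothetical names, not Parquet's API.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is not Parquet's DeltaByteArrayReader.
final class DeltaByteArraySketch {

    // One encoded value: the number of bytes reused from the previous value, plus a new suffix.
    static final class Entry {
        final int prefixLength;
        final byte[] suffix;

        Entry(int prefixLength, String suffix) {
            this.prefixLength = prefixLength;
            this.suffix = suffix.getBytes(StandardCharsets.UTF_8);
        }
    }

    // Decodes one page. The previous value starts out empty at the page boundary.
    static List<String> decodePage(List<Entry> page) {
        byte[] previous = new byte[0];
        List<String> values = new ArrayList<>();
        for (Entry entry : page) {
            byte[] value = new byte[entry.prefixLength + entry.suffix.length];
            // Fails with an IndexOutOfBoundsException when prefixLength > previous.length.
            System.arraycopy(previous, 0, value, 0, entry.prefixLength);
            System.arraycopy(entry.suffix, 0, value, entry.prefixLength, entry.suffix.length);
            values.add(new String(value, StandardCharsets.UTF_8));
            previous = value;
        }
        return values;
    }

    public static void main(String[] args) {
        // A well-formed page: the first value has prefix length 0.
        List<Entry> wellFormed = Arrays.asList(new Entry(0, "apple"), new Entry(3, "ly"));
        System.out.println(decodePage(wellFormed));

        // Assumed failure mode: the first value of a page refers to a previous
        // value that the reader does not have.
        List<Entry> broken = Arrays.asList(new Entry(3, "ly"));
        try {
            decodePage(broken);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("decode failed: " + e.getClass().getSimpleName());
        }
    }
}
```

      If this reading is right, the first value of the failing page would show a prefix length greater than zero when the page is inspected on its own.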

            People

              Assignee: Konstantin Shaposhnikov (k.shaposhnikov@gmail.com)
              Reporter: Konstantin Shaposhnikov (k.shaposhnikov@gmail.com)
              Votes: 0
              Watchers: 8
