Parquet / PARQUET-244

DeltaByteArrayReader fails with ArrayIndexOutOfBoundsException when moving across pages

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.6.0
    • Fix Version/s: None
    • Component/s: parquet-mr
    • Labels: None

    Description

      DeltaByteArrayReader.readBytes() throws ArrayIndexOutOfBoundsException shortly after the reader has been initialized for a new page via initFromPage(). To reproduce, read a Binary column that is encoded with the delta byte array encoding and spans multiple pages.

      The cause is that ColumnReaderImpl.initDataReader() creates a new ValuesReader every time a new page is processed (see this.dataColumn = dataEncoding.getValuesReader(path, VALUES)). DeltaByteArrayReader is stateful: it has to remember, across pages, the previous Binary value that was read. When a new DeltaByteArrayReader is created, that value is lost.
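
      The failure mode described above can be illustrated with a small, self-contained model. This is only a sketch and not the parquet-mr code: DeltaPageDemo, Entry, PrefixReader, encode and decodePages are hypothetical names, and the encoding is simplified to (prefix length, suffix) pairs. It assumes, as the report implies, that the writer computes prefixes against the previous value even when that value sits on an earlier page.

      ```java
      import java.nio.charset.StandardCharsets;
      import java.util.ArrayList;
      import java.util.Arrays;
      import java.util.List;

      /** Simplified model of prefix-delta decoding across pages (hypothetical, not parquet-mr). */
      final class DeltaPageDemo {

          /** One encoded value: leading bytes reused from the previous value, plus the new suffix. */
          static final class Entry {
              final int prefixLength;
              final byte[] suffix;

              Entry(int prefixLength, byte[] suffix) {
                  this.prefixLength = prefixLength;
                  this.suffix = suffix;
              }
          }

          /** Hypothetical stand-in for a stateful reader such as DeltaByteArrayReader. */
          static final class PrefixReader {
              private byte[] previous = new byte[0];

              void setPrevious(byte[] value) { previous = value; }

              byte[] getPrevious() { return previous; }

              byte[] read(Entry e) {
                  byte[] out = new byte[e.prefixLength + e.suffix.length];
                  // Throws an IndexOutOfBoundsException if previous is shorter than prefixLength.
                  System.arraycopy(previous, 0, out, 0, e.prefixLength);
                  System.arraycopy(e.suffix, 0, out, e.prefixLength, e.suffix.length);
                  previous = out;
                  return out;
              }
          }

          /** Encodes the whole column; prefixes are computed against the previous value. */
          static List<Entry> encode(List<String> values) {
              List<Entry> entries = new ArrayList<>();
              byte[] prev = new byte[0];
              for (String v : values) {
                  byte[] cur = v.getBytes(StandardCharsets.UTF_8);
                  int p = 0;
                  while (p < prev.length && p < cur.length && prev[p] == cur[p]) {
                      p++;
                  }
                  entries.add(new Entry(p, Arrays.copyOfRange(cur, p, cur.length)));
                  prev = cur;
              }
              return entries;
          }

          /** Decodes page by page with a fresh reader per page, optionally carrying the last value over. */
          static List<String> decodePages(List<List<Entry>> pages, boolean carryState) {
              List<String> out = new ArrayList<>();
              byte[] last = new byte[0];
              for (List<Entry> page : pages) {
                  PrefixReader reader = new PrefixReader(); // new reader for every page, as in the report
                  if (carryState) {
                      reader.setPrevious(last); // hand over the last value of the previous page
                  }
                  for (Entry e : page) {
                      out.add(new String(reader.read(e), StandardCharsets.UTF_8));
                  }
                  last = reader.getPrevious();
              }
              return out;
          }

          public static void main(String[] args) {
              List<Entry> entries = encode(Arrays.asList("parquet-mr", "parquet-format", "parquet-cpp"));
              List<List<Entry>> pages = Arrays.asList(entries.subList(0, 2), entries.subList(2, 3));
              System.out.println(decodePages(pages, true));
              try {
                  decodePages(pages, false);
              } catch (IndexOutOfBoundsException ex) {
                  System.out.println("Without carried state: " + ex);
              }
          }
      }
      ```

      In this model the first value of the second page has a non-zero prefix length, so a fresh reader with an empty previous value fails in System.arraycopy, while handing the last value of the previous page to the new reader decodes the column correctly. Whether the actual fix carries state this way or keeps one reader across pages is not stated in this report.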

    People

      Assignee: Unassigned
      Reporter: Alosh Bennett (aloshbennett)