Parquet / PARQUET-1901

Add filter null check for ColumnIndex

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.11.0
    • Fix Version/s: None
    • Component/s: parquet-mr
    • Labels: None

    Description

      This Jira is opened to discuss whether we should add a null check for the filter when ColumnIndex is enabled.

      In the ColumnIndexFilter#calculateRowRanges() method, the input parameter 'filter' is assumed to be non-null and is never checked. The method throws a NullPointerException when ColumnIndex is enabled (the default) but no filter is set in the ParquetReadOptions. The call stack is as follows:
      java.lang.NullPointerException
      at org.apache.parquet.internal.filter2.columnindex.ColumnIndexFilter.calculateRowRanges(ColumnIndexFilter.java:81)
      at org.apache.parquet.hadoop.ParquetFileReader.getRowRanges(ParquetFileReader.java:961)
      at org.apache.parquet.hadoop.ParquetFileReader.readNextFilteredRowGroup(ParquetFileReader.java:891)
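
      To make the proposal concrete, here is a minimal sketch of what a null guard could look like. It does not use the parquet-mr classes: RowFilter and calculateRowCount are hypothetical stand-ins that only model the control flow, under the assumption that "no filter" should mean "every row matches".

```java
// Hypothetical stand-ins, NOT the parquet-mr API: they only model the control flow.
final class NullFilterGuardSketch {

    /** Stand-in for a filter: returns true for rows that should be kept. */
    interface RowFilter {
        boolean keep(long rowIndex);
    }

    /** Stand-in for the row-range calculation: counts the rows that survive the filter. */
    static long calculateRowCount(RowFilter filter, long rowCount) {
        // The guard under discussion: a missing filter selects every row
        // instead of being dereferenced and throwing a NullPointerException.
        if (filter == null) {
            return rowCount;
        }
        long kept = 0;
        for (long i = 0; i < rowCount; i++) {
            if (filter.keep(i)) {
                kept++;
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // No filter set: all 10 rows are selected.
        System.out.println(calculateRowCount(null, 10));
        // Filter keeping even row indexes only.
        System.out.println(calculateRowCount(i -> i % 2 == 0, 10));
    }
}
```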

      If we don't add the check, users have to choose between calling readNextRowGroup() and readNextFilteredRowGroup() themselves, depending on whether a filter is set.
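
      For comparison, the caller-side alternative might look roughly like the sketch below. FakeReader is a hypothetical stand-in, not ParquetFileReader; it only borrows the two method names and mimics the reported behaviour of dereferencing the filter without a check.

```java
import java.util.Objects;

final class CallerSideDispatchSketch {

    /** Hypothetical stand-in for a reader; it is not ParquetFileReader. */
    static final class FakeReader {
        final Object filter; // null models "no filter set in the read options"

        FakeReader(Object filter) {
            this.filter = filter;
        }

        String readNextRowGroup() {
            return "unfiltered";
        }

        String readNextFilteredRowGroup() {
            // Mimics the reported bug: the filter is assumed to be non-null.
            Objects.requireNonNull(filter);
            return "filtered";
        }
    }

    /** The dispatch every caller would have to write if no guard is added. */
    static String readNext(FakeReader reader) {
        return reader.filter == null
            ? reader.readNextRowGroup()
            : reader.readNextFilteredRowGroup();
    }

    public static void main(String[] args) {
        System.out.println(readNext(new FakeReader(null)));
        System.out.println(readNext(new FakeReader("some filter")));
    }
}
```

      Every call site needs the same branch in this variant, which is the burden a single check inside the library would remove.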

      Thoughts?

          People

            shangx@uber.com Xinli Shang
            Votes: 0
            Watchers: 4
