[PARQUET-1901] Add filter null check for ColumnIndex - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 1.11.0
Fix Version/s: None
Component/s: parquet-mr
Labels: None

Description

This Jira is opened to discuss whether we should add a null check for the filter when ColumnIndex is enabled.

In the ColumnIndexFilter#calculateRowRanges() method, the input parameter 'filter' is assumed to be non-null and is never checked. The method throws a NullPointerException when ColumnIndex is enabled (the default) but no filter is set in the ParquetReadOptions. The call stack is below.
java.lang.NullPointerException
at org.apache.parquet.internal.filter2.columnindex.ColumnIndexFilter.calculateRowRanges(ColumnIndexFilter.java:81)
at org.apache.parquet.hadoop.ParquetFileReader.getRowRanges(ParquetFileReader.java:961)
at org.apache.parquet.hadoop.ParquetFileReader.readNextFilteredRowGroup(ParquetFileReader.java:891)
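
To make the proposal concrete, here is a minimal sketch of what a null check inside the row-range calculation could look like. It is not the parquet-mr code: `Range`, `RowFilter`, and this `calculateRowRanges` signature are hypothetical stand-ins, and treating a missing filter as "keep every row" is an assumption about the desired behavior.

```java
public class NullSafeRowRanges {
    /** Hypothetical stand-in for a row-range result: rows [from, to) of a row group. */
    static final class Range {
        final long from;
        final long to;

        Range(long from, long to) {
            this.from = from;
            this.to = to;
        }

        long count() {
            return to - from;
        }
    }

    /** Hypothetical stand-in for a filter: narrows a row group to the rows it keeps. */
    interface RowFilter {
        Range apply(long rowCount);
    }

    /**
     * Sketch of the guarded method. Assumption: with no filter set,
     * every row in the row group is selected instead of throwing an NPE.
     */
    static Range calculateRowRanges(RowFilter filter, long rowCount) {
        if (filter == null) {
            return new Range(0, rowCount);
        }
        return filter.apply(rowCount);
    }

    public static void main(String[] args) {
        // No filter: the guard returns the full range.
        System.out.println(calculateRowRanges(null, 100).count());

        // With a filter that keeps the first half of the rows.
        RowFilter firstHalf = rowCount -> new Range(0, rowCount / 2);
        System.out.println(calculateRowRanges(firstHalf, 100).count());
    }
}
```

With a guard like this, a caller that never sets a filter would get the same rows as an unfiltered read.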

If we don't add the check, the caller has to choose between readNextRowGroup() and readNextFilteredRowGroup() depending on whether a filter is set.
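
The caller-side alternative could be sketched as follows. `RowGroupSource` and `FakeSource` are hypothetical stand-ins that only borrow the two method names from ParquetFileReader; the fake reproduces the reported failure by throwing a NullPointerException when the filtered path runs without a filter.

```java
import java.util.Objects;

public class ReadDispatch {
    /** Hypothetical stand-in exposing the two read methods discussed in this issue. */
    interface RowGroupSource {
        String readNextRowGroup();

        String readNextFilteredRowGroup();
    }

    /** Fake reader: the filtered path fails with an NPE when no filter was set. */
    static final class FakeSource implements RowGroupSource {
        private final Object filter;

        FakeSource(Object filter) {
            this.filter = filter;
        }

        @Override
        public String readNextRowGroup() {
            return "all rows";
        }

        @Override
        public String readNextFilteredRowGroup() {
            Objects.requireNonNull(filter, "filter");
            return "filtered rows";
        }
    }

    /** The caller checks for a filter itself and picks the matching read method. */
    static String readNext(RowGroupSource source, Object filter) {
        return filter == null
                ? source.readNextRowGroup()
                : source.readNextFilteredRowGroup();
    }

    public static void main(String[] args) {
        System.out.println(readNext(new FakeSource(null), null));

        Object someFilter = new Object();
        System.out.println(readNext(new FakeSource(someFilter), someFilter));
    }
}
```

Every caller would need to repeat this branch, which is the cost the proposed null check is meant to avoid.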

Thoughts?

People

Assignee: Xinli Shang

Reporter: Xinli Shang

Votes: 0

Watchers: 4

Dates

Created: 22/Aug/20 20:24

Updated: 23/Jun/24 03:31