PARQUET-2244

Dictionary filter may skip row-groups incorrectly when evaluating notIn

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.12.2
    • Fix Version/s: 1.13.0
    • Component/s: parquet-mr
    • Labels: None

    Description

      Dictionary filter may skip row-groups incorrectly when evaluating `notIn` on optional columns with null values. Here is an example:

      Say there is an optional column `c1` whose pages are all dictionary-encoded. `c1` contains exactly two distinct values, ['foo', null], and the predicate is `c1 not in ('foo', 'bar')`.

      The dictionary filter may then skip this row-group even though it should not be skipped: the dictionary holds only 'foo', which is in the `notIn` set, but the column also contains nulls, which do not appear in the dictionary.
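
The example above can be reproduced with a small standalone sketch. This is not the parquet-mr implementation: the class and method names (`NotInDictionaryFilterSketch`, `canDropIgnoringNulls`, `canDropNullAware`) are hypothetical, the dictionary is modeled as a plain `Set<String>`, and the `notIn` set is assumed to contain no null entry. It only illustrates why a drop decision based on dictionary values alone is unsafe when the column may hold nulls.

```java
import java.util.Set;

// Hypothetical, simplified model of a dictionary-based row-group filter.
// It does not use any parquet-mr classes.
class NotInDictionaryFilterSketch {

    // Problematic decision: drop the row-group when every dictionary value
    // is listed in the notIn set. Nulls are not part of the dictionary, so
    // they are silently ignored here.
    static boolean canDropIgnoringNulls(Set<String> dictionaryValues,
                                        Set<String> notInValues) {
        return notInValues.containsAll(dictionaryValues);
    }

    // Null-aware decision (assumes notInValues itself has no null entry):
    // if the column may contain nulls, some rows may still satisfy the
    // predicate, so the row-group must be kept.
    static boolean canDropNullAware(Set<String> dictionaryValues,
                                    boolean mayContainNulls,
                                    Set<String> notInValues) {
        if (mayContainNulls) {
            return false;
        }
        return notInValues.containsAll(dictionaryValues);
    }

    public static void main(String[] args) {
        // Column c1 has the distinct values ['foo', null]; only 'foo' is
        // present in the dictionary.
        Set<String> dictionary = Set.of("foo");
        Set<String> notIn = Set.of("foo", "bar");

        // Row-group is wrongly considered droppable.
        System.out.println(canDropIgnoringNulls(dictionary, notIn));
        // Row-group is kept because the column may contain nulls.
        System.out.println(canDropNullAware(dictionary, true, notIn));
    }
}
```

In this sketch the null-aware variant simply refuses to drop whenever nulls are possible, which is conservative: it may read a row-group unnecessarily but never skips matching rows.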

       

      This is a bug similar to #1510.

          People

            zhongyuj Yujiang Zhong