Description
I seem to have stumbled upon two issues with Sarg (search argument) evaluation in ORC.
I have created a test case for each at https://github.com/shardulm94/orc/commit/b6d97cfa0325d2a14094456d338c942f61b887f2.
In the first case, applying not(isNull(column)) to a column in which every value is null seems to incorrectly mark the row group as needed. This is a rather benign issue, though, since the only effect is that some extra row groups are returned.
In the second case, I create a column that has only two possible values, null or 1, depending on whether the row index is even or odd, so every row group is guaranteed to contain both null and 1. Applying not(in(column, 1)) to this column incorrectly marks the row group as not needed. The row group contains null values, which should be matched by not(in(column, 1)). As a result, row groups that contain matching rows may be filtered out incorrectly.
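To make the expected results of the two cases concrete, here is a minimal row-level sketch in plain Java. It does not use the ORC API. The class `SargExpectation` and its helpers are hypothetical names introduced only for illustration, and the choice of which parity maps to null is an assumption. The `NOT_IN_ONE` predicate encodes the expectation stated above, namely that a null value is matched by not(in(column, 1)). Under strict SQL three-valued logic, NULL NOT IN (1) evaluates to NULL rather than true, so this encodes the reading used in this report, not a settled semantics.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model, not ORC code: a row group is "needed" if at least
// one of its rows satisfies the predicate.
class SargExpectation {

    // not(isNull(column)): matches only non-null values.
    static final Predicate<Long> NOT_IS_NULL = v -> v != null;

    // not(in(column, 1)) under the assumption made in this report:
    // a null value counts as "not in (1)" and is therefore matched.
    static final Predicate<Long> NOT_IN_ONE = v -> v == null || v != 1L;

    static boolean rowGroupNeeded(List<Long> rows, Predicate<Long> predicate) {
        return rows.stream().anyMatch(predicate);
    }

    // Case 1 data: every value in the column is null.
    static List<Long> allNulls(int rowCount) {
        List<Long> rows = new ArrayList<>();
        for (int i = 0; i < rowCount; i++) {
            rows.add(null);
        }
        return rows;
    }

    // Case 2 data: null or 1 depending on row index parity
    // (which parity is null is an assumption).
    static List<Long> nullOrOne(int rowCount) {
        List<Long> rows = new ArrayList<>();
        for (int i = 0; i < rowCount; i++) {
            rows.add(i % 2 == 0 ? null : Long.valueOf(1L));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Case 1: no row is non-null, so the row group should not be needed.
        System.out.println(rowGroupNeeded(allNulls(1000), NOT_IS_NULL));
        // Case 2: null rows match under the stated assumption, so it should be needed.
        System.out.println(rowGroupNeeded(nullOrOne(1000), NOT_IN_ONE));
    }
}
```

In this model the first check yields false and the second yields true, which is the opposite of what the Sarg evaluation reports in both cases.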