ORC / ORC-623

Potentially incorrect Sarg evaluation for not(in) and not(isNull)

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.5.11, 1.6.4, 1.7.0
    • Component/s: None
    • Labels: None

    Description

      I seem to have stumbled upon two issues with Sarg (search argument) evaluation in ORC.

      I have created a test case for each of them at https://github.com/shardulm94/orc/commit/b6d97cfa0325d2a14094456d338c942f61b887f2

      In the first case, applying not(isNull(column)) to a column that contains only null values seems to incorrectly mark the row group as needed. This is a rather benign issue, though, since the only effect is that some extra row groups are returned.
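
      To make the expected result concrete, here is a minimal, self-contained sketch. It does not use the ORC API; the class and method names are hypothetical. It checks the rows by brute force: a row group is needed only if at least one row satisfies not(isNull(column)).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical reference check, not ORC code.
class NotIsNullExample {
    // A row group is needed only if some row satisfies not(isNull(column)),
    // i.e. if at least one value is non-null.
    static boolean rowGroupNeeded(List<Long> rowGroup) {
        return rowGroup.stream().anyMatch(Objects::nonNull);
    }

    public static void main(String[] args) {
        List<Long> allNulls = Arrays.asList(null, null, null, null);
        List<Long> mixed = Arrays.asList(null, 1L, null, 1L);

        System.out.println(rowGroupNeeded(allNulls)); // false: no row can match
        System.out.println(rowGroupNeeded(mixed));    // true: the 1 values match
    }
}
```

      For the all-null column the check returns false, so skipping the row group would be safe; marking it as needed only costs extra reads.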

      In the second case, I create a column that has only two possible values, either null or 1, depending on whether the row index is even or odd. So every row group is guaranteed to contain both null and 1. Applying not(in(column, 1)) to this column incorrectly marks the row group as not needed, even though the row group contains null values that should be matched by not(in(column, 1)). As a result, row groups that contain matching rows may be filtered out incorrectly.
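
      The second scenario can be sketched the same way, again without the ORC API and with hypothetical names. Two assumptions are made here: null is placed on even row indexes and 1 on odd ones (the description does not say which is which), and a null row counts as a match for not(in(column, 1)), which is the reading this report relies on.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reference check, not ORC code.
class NotInExample {
    // Assumption: null on even row indexes, 1 on odd row indexes.
    static List<Long> buildColumn(int rowCount) {
        List<Long> column = new ArrayList<>();
        for (int i = 0; i < rowCount; i++) {
            column.add(i % 2 == 0 ? null : 1L);
        }
        return column;
    }

    // Assumption taken from the report: a null value satisfies not(in(column, literal)).
    static boolean matchesNotIn(Long value, long literal) {
        return value == null || value != literal;
    }

    // A row group is needed if at least one of its rows matches.
    static boolean rowGroupNeeded(List<Long> rowGroup, long literal) {
        return rowGroup.stream().anyMatch(v -> matchesNotIn(v, literal));
    }

    public static void main(String[] args) {
        List<Long> column = buildColumn(8);
        int groupSize = 4; // illustrative row group size
        for (int start = 0; start < column.size(); start += groupSize) {
            List<Long> group = column.subList(start, start + groupSize);
            // Every group holds both null and 1, so each line prints "true".
            System.out.println(rowGroupNeeded(group, 1L));
        }
    }
}
```

      Under these assumptions every row group is needed, so an evaluator that marks them as not needed drops rows that should have been returned.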

            People

              Assignee: Owen O'Malley (omalley)
              Reporter: Shardul Mahadik (shardulm)
              Votes: 0
              Watchers: 4