Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-26733

Not safe to use '=' for predicates on constant expressions that might be NULL

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 4.0.0-alpha-1
    • None
    • CBO

    Description

      HiveRelMdPredicates was forked from Calcite's RelMdPredicates long time ago.

      Hive's version lacks this commit https://github.com/apache/calcite/commit/8281668f which introduced the use of "IS NOT DISTINCT FROM" in place of "EQUAL" when a constant expression can be NULL.

      There is no Calcite ticket for this change, so I am briefly explaining the issue here.

      Consider the following input as argument of HiveRelMdPredicates::pullUpPredicates(Project) method:

      SELECT char_length(NULL) FROM t

      The method currently infers the predicate (=($0, CHAR_LENGTH(null:NULL))) which translates to "=(NULL, NULL)", which turns simplifies to FALSE under the unknownAsFalse semantics.

      The change will make this methods return "IS NOT DISTINCT FROM($0, CHAR_LENGTH(null:NULL))", which translates to IS NOT DISTINCT FROM(NULL, NULL), which is TRUE.

      For reference, we have the truth table below (from [1]):

      A B A = B A IS NOT DISTINCT FROM B
      0 0 true true
      0 1 false false
      0 null unknown false
      null null unknown true

      [1] https://modern-sql.com/feature/is-distinct-from

      Attachments

        Issue Links

          Activity

            People

              asolimando Alessandro Solimando
              asolimando Alessandro Solimando
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

                Created:
                Updated:

                Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 1h 20m
                  1h 20m