Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-24908

Adding Respect/Ignore nulls as a UDAF parameter is ambiguous

    XMLWordPrintableJSON

Details

    Description

      Both function calls translated to the same UDAF call:

      SELECT lead(a, 2, true) ...
      SELECT lead(a, 2) IGNORE NULLS ...
      

      IGNORE NULLS is passed as an extra constant boolean parameter to the UDAF
      https://github.com/apache/hive/blob/eed78dfdcb6dfc2de400397a60de12e6f62b96e2/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/ASTConverter.java#L743

      However the semantics of the two function calls has different semantics:

      • lead(a, 2, true) - 'true' is the default value: "The value of DEFAULT is returned as the result if there is no row corresponding to the OFFSET number of rows before R within P (for the lag function) or after R within P (for the lead function)"
      • lead(a, 2) IGNORE NULLS - For each row in the current window find the 2nd not-NULL value starting directly after the current row.

      Attachments

        Issue Links

          Activity

            People

              kkasa Krisztian Kasa
              kkasa Krisztian Kasa
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 20m
                  20m