Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-9000

LAST_VALUE Window function returns wrong results

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Invalid
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: PTF-Windowing
    • Labels:
      None

      Description

      LAST_VALUE Windowing function has been returning bad results, as far as I can tell from day 1.

      And, it seems like the tests are also asserting that LAST_VALUE gives the wrong result.

      Here's the test output:
      https://github.com/apache/hive/blob/branch-0.14/ql/src/test/results/clientpositive/windowing_navfn.q.out#L587

      The query is:

      select t, s, i, last_value(i) over (partition by t order by s) from over10k where (s = 'oscar allen' or s = 'oscar carson') and t = 10
      

      The result is:

      t              s                    i          last_value(i)
      -------------------------------------------------------
      10	oscar allen	65662	65662
      10	oscar carson	65549	65549
      

      LAST_VALUE( i ) should have returned 65549 in both records, instead it simply ends up returning i.

      Another way you can make sure LAST_VALUE is bad is to verify it's result against LEAD(i,1) over (partition by t order by s). LAST_VALUE being last value should always be more (in terms of the specified 'order by s') than the lead by 1. While this doesn't directly apply to the above query, if the result set had more rows, you would clearly see records where lead is higher than last_value which is semantically incorrect.

        Attachments

          Activity

            People

            • Assignee:
              Unassigned
              Reporter:
              mgrover Mark Grover
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: