HIVE-20174

Vectorization: Fix NULL / Wrong Results issues in GROUP BY Aggregation Functions


Details

    Description

      Write new unit tests that use random data and intentional isRepeating batches to check for NULL and wrong results in vectorized aggregation functions.
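The testing idea above can be sketched roughly as follows. This is a minimal illustration, not the code from the attached patches: `LongBatch`, `CountSum`, `aggregate`, `reference`, and `randomBatch` are hypothetical names, and the batch class is only a simplified stand-in for a vectorized column with `isRepeating`, `noNulls`, and `isNull` flags. The sketch assumes that a repeating batch stores its single value (or NULL flag) in entry 0.

```java
import java.util.Random;

final class RepeatingBatchSketch {
    /** Hypothetical, simplified stand-in for a vectorized long column. */
    static final class LongBatch {
        long[] vector;
        boolean[] isNull;
        boolean noNulls;      // true means isNull[] can be ignored
        boolean isRepeating;  // true means entry 0 stands for every row
        int size;
    }

    /** Hypothetical partial result: count of non-NULL rows and their sum. */
    static final class CountSum {
        long count;
        long sum;
    }

    /** Batch-aware aggregation with a fast path for isRepeating batches. */
    static CountSum aggregate(LongBatch b) {
        CountSum r = new CountSum();
        if (b.size == 0) {
            return r;
        }
        if (b.isRepeating) {
            // Only entry 0 is meaningful; a repeated NULL contributes nothing.
            if (b.noNulls || !b.isNull[0]) {
                r.count = b.size;
                r.sum = b.vector[0] * b.size;
            }
            return r;
        }
        for (int i = 0; i < b.size; i++) {
            if (b.noNulls || !b.isNull[i]) {
                r.count++;
                r.sum += b.vector[i];
            }
        }
        return r;
    }

    /** Row-by-row reference: expands the batch to one logical value per row. */
    static CountSum reference(LongBatch b) {
        CountSum r = new CountSum();
        for (int i = 0; i < b.size; i++) {
            int src = b.isRepeating ? 0 : i;
            boolean rowIsNull = !b.noNulls && b.isNull[src];
            if (!rowIsNull) {
                r.count++;
                r.sum += b.vector[src];
            }
        }
        return r;
    }

    /** Builds a random batch; roughly one value in four is NULL when allowed. */
    static LongBatch randomBatch(Random rnd, int size, boolean repeating, boolean allowNulls) {
        LongBatch b = new LongBatch();
        b.size = size;
        b.vector = new long[size];
        b.isNull = new boolean[size];
        b.isRepeating = repeating;
        b.noNulls = true;
        for (int i = 0; i < size; i++) {
            b.vector[i] = rnd.nextInt(1000);
            if (allowNulls && rnd.nextInt(4) == 0) {
                b.isNull[i] = true;
                b.noNulls = false;
            }
        }
        return b;
    }

    public static void main(String[] args) {
        Random rnd = new Random(20174L);
        for (int trial = 0; trial < 1000; trial++) {
            LongBatch b = randomBatch(rnd, 1 + rnd.nextInt(64), rnd.nextBoolean(), rnd.nextBoolean());
            CountSum got = aggregate(b);
            CountSum want = reference(b);
            if (got.count != want.count || got.sum != want.sum) {
                throw new AssertionError("mismatch at trial " + trial);
            }
        }
        System.out.println("all trials matched");
    }
}
```

In a real test the reference would more plausibly be the row-mode evaluation of the same aggregate rather than a second hand-written loop; the loop here only keeps the sketch self-contained.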

       

      Bugs found:

      1) AVG and the VARIANCE family in PARTIAL1 mode were returning NULL instead of count = 0, sum = 0 (all data types).  For AVG DECIMAL, NULL is now returned only if there was an overflow.

      2) AVG/MIN/MAX were not detecting a repeated NULL correctly for the TIMESTAMP, INTERVAL_DAY_TIME, and string family types.  Redundant code was eliminated.

      3) Fix incorrect calculation for the VARIANCE family in PARTIAL2 and FINAL modes (HIVE-18758).

      4) Fix row-mode AVG DECIMAL to enforce output type precision and scale in COMPLETE and FINAL modes.
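To make items 1) and 3) more concrete, here is a minimal sketch of how variance partials can be merged. It uses the standard pairwise-merge formula for the sum of squared deviations and is not taken from the attached patches. `Partial`, `of`, `merge`, and `varPop` are hypothetical names, the `m2` field is an assumption about what a variance partial carries, and returning NULL from the final step when count = 0 is likewise an assumption of this sketch. Note that an empty partial is a real value (count = 0, sum = 0), not NULL, so it can still be merged.

```java
final class VarianceMergeSketch {
    /** Hypothetical partial: row count, sum, and sum of squared deviations from the mean. */
    static final class Partial {
        long count;
        double sum;
        double m2;
    }

    /** Builds a partial from raw values; with no values it stays at count = 0, sum = 0. */
    static Partial of(double... values) {
        Partial p = new Partial();
        for (double v : values) {
            p.count++;
            p.sum += v;
        }
        if (p.count > 0) {
            double mean = p.sum / p.count;
            for (double v : values) {
                p.m2 += (v - mean) * (v - mean);
            }
        }
        return p;
    }

    /** Merges two partials, as a PARTIAL2 or FINAL step would have to. */
    static Partial merge(Partial a, Partial b) {
        Partial r = new Partial();
        r.count = a.count + b.count;
        r.sum = a.sum + b.sum;
        if (a.count == 0) {
            r.m2 = b.m2;
        } else if (b.count == 0) {
            r.m2 = a.m2;
        } else {
            // Correct for the difference between the two partial means.
            double delta = b.sum / b.count - a.sum / a.count;
            r.m2 = a.m2 + b.m2 + delta * delta * ((double) a.count * b.count / r.count);
        }
        return r;
    }

    /** Population variance; null stands in for SQL NULL when no rows were seen. */
    static Double varPop(Partial p) {
        return p.count == 0 ? null : p.m2 / p.count;
    }
}
```

For example, merging the partials for {1, 2, 3} and {4, 5, 6, 7} should give the same count, sum, and `m2` as a single pass over {1, ..., 7}, up to floating-point rounding.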

       

      Attachments

        1. HIVE-20174.01.patch
          109 kB
          Matt McCline
        2. HIVE-20174.02.patch
          176 kB
          Matt McCline
        3. HIVE-20174.03.patch
          172 kB
          Matt McCline
        4. HIVE-20174.04.patch
          172 kB
          Matt McCline
        5. HIVE-20174.05.patch
          172 kB
          Matt McCline

            People

              mmccline Matt McCline