HIVE-16919

Vectorization: vectorization_short_regress.q has query result differences with non-vectorized run. Vectorized unary function broken?

Details

    • Bug
    • Status: Resolved
    • Critical
    • Resolution: Fixed
    • Hive

    Description

      Jason spotted differences in the query results for vectorization_short_regress.q.out: when vectorization is turned off and a base .q.out file is created, there are 2 differences from the vectorized output.

      Both seem to be related to negation. For example, in the first one, MAX(cint) and MIN(cint) appear earlier as columns and match between the non-vectorized and vectorized runs, so aggregation does not appear to be failing. It seems that now that the Reducer is vectorized, a bug is exposed: even though MAX and MIN are the same, the expression with negation returns different results.

      19th field of the query below: Vectorized 511 vs Non-Vectorized -58

      SELECT MAX(cint),
             (MAX(cint) / -3728),
             (MAX(cint) * -3728),
             VAR_POP(cbigint),
             (-((MAX(cint) * -3728))),
             STDDEV_POP(csmallint),
             (-563 % (MAX(cint) * -3728)),
             (VAR_POP(cbigint) / STDDEV_POP(csmallint)),
             (-(STDDEV_POP(csmallint))),
             MAX(cdouble),
             AVG(ctinyint),
             (STDDEV_POP(csmallint) - 10.175),
             MIN(cint),
             ((MAX(cint) * -3728) % (STDDEV_POP(csmallint) - 10.175)),
             (-(MAX(cdouble))),
             MIN(cdouble),
             (MAX(cdouble) % -26.28),
             STDDEV_SAMP(csmallint),
             (-((MAX(cint) / -3728))),
             ((-((MAX(cint) * -3728))) % (-563 % (MAX(cint) * -3728))),
             ((MAX(cint) / -3728) - AVG(ctinyint)),
             (-((MAX(cint) * -3728))),
             VAR_SAMP(cint)
      FROM   alltypesorc
      WHERE  (((cbigint <= 197)
               AND (cint < cbigint))
              OR ((cdouble >= -26.28)
                  AND (csmallint > cdouble))
              OR ((ctinyint > cfloat)
                  AND (cstring1 RLIKE '.*ss.*'))
              OR ((cfloat > 79.553)
                  AND (cstring2 LIKE '10%')))
      

      Column expression is: ((-((MAX(cint) * -3728))) % (-563 % (MAX(cint) * -3728))),
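
As a sanity check on the two values reported for this column, the sketch below evaluates the same expression in plain Java twice: once with 32-bit int arithmetic, where MAX(cint) * -3728 can wrap around, and once with 64-bit long arithmetic. The input -20301111 is an assumption chosen for illustration (this report does not state the value of MAX(cint)), and the class and method names are hypothetical. With that input, the 32-bit evaluation yields 511 and the 64-bit evaluation yields -58, so a difference in integer width for an intermediate result is one possibility worth ruling out alongside negation. The sketch does not show which Hive code path uses which width.

```java
class ModuloWidthSketch {
    // Evaluates ((-(m * -3728)) % (-563 % (m * -3728))) with 32-bit int arithmetic.
    // The product may silently wrap around, as Java int multiplication does.
    // Throws ArithmeticException if a divisor evaluates to zero (for example m == 0).
    static int evalAsInt(int m) {
        int product = m * -3728;
        return (-product) % (-563 % product);
    }

    // Same expression with 64-bit long arithmetic, so the product cannot wrap
    // for any input that fits in 32 bits.
    static long evalAsLong(long m) {
        long product = m * -3728L;
        return (-product) % (-563L % product);
    }

    public static void main(String[] args) {
        int maxCint = -20301111; // assumed value, for illustration only
        System.out.println("32-bit: " + evalAsInt(maxCint));
        System.out.println("64-bit: " + evalAsLong(maxCint));
    }
}
```

Java's % takes the sign of the dividend, which is why the wrapped (positive) dividend gives a positive remainder and the unwrapped (negative) dividend gives a negative one.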

      -----------------------------------------------

      This is a previously existing issue, now filed as HIVE-16919: "Vectorization: vectorization_short_regress.q has query result differences with non-vectorized run". It appears in the query result for vectorization_short_regress.q.out when vectorization is turned off and a base .q.out file is created.

      10th field of the query below: Non-Vectorized -6432.000015344526 vs. Vectorized -6432.0

      Column expression is (-(cdouble)) as c4,

      SELECT   ctimestamp1,
               cstring2,
               cdouble,
               cfloat,
               cbigint,
               csmallint,
               (cbigint / 3569) as c1,
               (-257 - csmallint) as c2,
               (-6432 * cfloat) as c3,
               (-(cdouble)) as c4,
               (cdouble * 10.175) as c5,
               ((-6432 * cfloat) / cfloat) as c6,
               (-(cfloat)) as c7,
               (cint % csmallint) as c8,
               (-(cdouble)) as c9,
               (cdouble * (-(cdouble))) as c10
      FROM     alltypesorc
      WHERE    (((-1.389 >= cint)
                 AND ((csmallint < ctinyint)
                      AND (-6432 > csmallint)))
                OR ((cdouble >= cfloat)
                    AND (cstring2 <= 'a'))
                OR ((cstring1 LIKE 'ss%')
                    AND (10.175 > cbigint)))
      

          People

            mmccline Matt McCline