Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-16102

Grouping sets do not conform to SQL standard


    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.3.0, 2.2.0
    • Fix Version/s: 2.3.0
    • Component/s: Operators, Parser
    • Labels:


      1. Ashutosh Chauhan realized that the implementation of GROUPING__ID in Hive was not returning values as specified by SQL standard and other execution engines.

      After digging into this, I found out that the implementation was bogus, as internally it was changing between big-endian/little-endian representation of GROUPING__ID indistinctly, and in some cases conversions in both directions were cancelling each other.

      In the documentation in https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation,+Cube,+Grouping+and+Rollup we can already find the problem, even if we did not spot it at first.

      The following query: SELECT key, value, GROUPING__ID, count(*) from T1 GROUP BY key, value WITH ROLLUP
      will have the following results.

      NULL NULL 0 6
      1 NULL 1 2
      1 NULL 3 1
      1 1 3 1


      Observe that value for GROUPING__ID in first row should be `3`, while for third and fourth rows, it should be `0`.


        1. HIVE-16102.patch
          15 kB
          Jesus Camacho Rodriguez
        2. HIVE-16102.02.patch
          58 kB
          Jesus Camacho Rodriguez
        3. HIVE-16102.01.patch
          17 kB
          Jesus Camacho Rodriguez

          Issue Links



              • Assignee:
                jcamachorodriguez Jesus Camacho Rodriguez
                jcamachorodriguez Jesus Camacho Rodriguez
              • Votes:
                0 Vote for this issue
                2 Start watching this issue


                • Created: