[HIVE-5237] Incorrect group-by aggregation in 0.11.0

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Duplicate
    • Affects Version/s: 0.11.0
    • None
    • None
    • None

    Description

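      A GROUP BY over a subquery that itself contains a GROUP BY does not aggregate results correctly in Hive 0.11.0: the outer query returns one row per row of the subquery instead of one row per group.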

      To reproduce:

      Put a file with the contents

      1,b
      2,c
      2,b
      3,a
      3,c
      4,a
      

      in HDFS under /data/ (the location the table below points to), and run

      create external table abc (x int, y string) row format delimited fields terminated by ',' location '/data/';
      

      The query

      select
              x,
              count(*)
      from
      (select
              x,
              y
      from
              abc
      group by
              x,
              y
      ) a
      group by
              x;
      

      will then give the result

      2	1
      3	1
      2	1
      4	1
      3	1
      1	1
      

      instead of the correct

      1	1
      2	2
      3	2
      4	1
      

      In 0.9.0 and 0.10.0 the same query returns the correct result.
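      The expected result can be cross-checked without a Hive installation. The sketch below is not part of the original report: it loads the six sample rows into an in-memory SQLite database (an assumption, used only as a stand-in SQL engine) and runs the same nested group-by. The function name `expected_counts` is hypothetical, and an `order by x` is added so the output order is deterministic.

```python
import sqlite3

# Sample rows from the report: the (x, y) pairs of the comma-delimited file.
ROWS = [(1, "b"), (2, "c"), (2, "b"), (3, "a"), (3, "c"), (4, "a")]

# The query from the report, with an ORDER BY added for a stable row order.
QUERY = """
select x, count(*)
from (select x, y from abc group by x, y) a
group by x
order by x
"""


def expected_counts(rows):
    """Run the nested group-by against an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table abc (x int, y text)")
        conn.executemany("insert into abc values (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # Print in the same tab-separated layout as the Hive CLI output above.
    for x, n in expected_counts(ROWS):
        print(f"{x}\t{n}")
```

      The inner query yields six distinct (x, y) pairs, which matches the six rows in the incorrect 0.11.0 output; the outer group-by should collapse them to four rows, one per value of x.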

            People

              Assignee: Unassigned
              Reporter: Børge Svingen (bsvingen)