HIVE-27876

Incorrect query results on tables with ClusterBy & SortBy


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0.0
    • Component/s: None

    Description

      Repro:

       

      create external table test_bucket(age int, name string, dept string) clustered by (age, name) sorted by (age asc, name asc) into 2 buckets stored as orc;
      insert into test_bucket values (1, 'user1', 'dept1'), (2, 'user2', 'dept2');
      insert into test_bucket values (1, 'user1', 'dept1'), (2, 'user2', 'dept2');
      
      -- Wrong result: no rows are returned, although each (age, name) pair occurs twice and should pass the having filter
      select age, name, count(*) from test_bucket group by age, name having count(*) > 1;
      +------+-------+------+
      | age  | name  | _c2  |
      +------+-------+------+
      +------+-------+------+
      
      -- Workaround: disable map-side aggregation for the session; the same query then returns the correct rows
      set hive.map.aggr=false;
      select age, name, count(*) from test_bucket group by age, name having count(*) > 1;
      +------+--------+------+
      | age  |  name  | _c2  |
      +------+--------+------+
      | 1    | user1  | 2    |
      | 2    | user2  | 2    |
      +------+--------+------+ 
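
      To check the workaround in a session, something like the following HiveQL could be used. It is a minimal sketch that assumes the test_bucket table from the repro already exists. It uses only standard commands (describe formatted, set, explain). This issue does not state where the two plans differ, so comparing the explain outputs is a suggested way to investigate, not a confirmed diagnosis.

      ```sql
      -- Confirm the bucket and sort columns recorded for the table.
      describe formatted test_bucket;

      -- Print the current value of the map-side aggregation flag.
      set hive.map.aggr;

      -- Plan with map-side aggregation enabled (the failing case in the repro).
      set hive.map.aggr=true;
      explain select age, name, count(*) from test_bucket group by age, name having count(*) > 1;

      -- Plan with the workaround applied.
      set hive.map.aggr=false;
      explain select age, name, count(*) from test_bucket group by age, name having count(*) > 1;

      -- Restore the documented default so later queries in the session are unaffected.
      set hive.map.aggr=true;
      ```

      Disabling map-side aggregation can make group by queries slower in general, so it is probably best limited to the affected queries rather than set globally.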

       

       


            People

              Assignee: Ramesh Kumar Thangarajan (rameshkumar)
              Reporter: Naresh P R (nareshpr)
              Votes: 0
              Watchers: 5
