Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-12778

Having with count distinct doesn't work for special combination

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 1.1.0, 1.2.1
    • None
    • Hive
    • None

    Description

      There is problem for combination of count(distinct ) in having clause without count(distinct ) in select clause.
      First case returns error FAILED: SemanticException [Error 10002]: Line Invalid column reference (unexpected)
      If I add count(distinct ) to select clause result is ok (expected).

      Please run code to see it.
      Steps to reproduce:

      create table table_subquery_having_problem (id int, value int);
      insert into table table_subquery_having_problem values (1,1);
      insert into table table_subquery_having_problem values (1,2);
      insert into table table_subquery_having_problem values (1,3);
      insert into table table_subquery_having_problem values (1,4);
      insert into table table_subquery_having_problem values (1,5);
      insert into table table_subquery_having_problem values (1,6);
      insert into table table_subquery_having_problem values (1,7);
      insert into table table_subquery_having_problem values (1,8);
      insert into table table_subquery_having_problem values (1,9);
      
      select x.id from table_subquery_having_problem x
      group by x.id
      having count(distinct x.value)>1;  -- result is ERROR
      
      select x.id, count(distinct x.value) from table_subquery_having_problem x
      group by x.id
      having count(distinct x.value)>1; --result is OK
      

      Attachments

        Activity

          People

            Unassigned Unassigned
            brejcak@centrum.cz Peter Brejcak
            Votes:
            2 Vote for this issue
            Watchers:
            6 Start watching this issue

            Dates

              Created:
              Updated: