IMPALA-1802

Impala produces incorrect count(distinct xxx) result with limit clause


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.1
    • Fix Version/s: Impala 2.2
    • None

    Description

      On a partitioned table, Impala returned the following results. All of these queries should return the same number, since each produces a single row and the limit clause should not change its value. We hope this bug can be fixed.

      select count(distinct ts) from data_table limit 10;
      1978021
      
      select count(distinct ts) from data_table limit 11;
      2176581   
      
      select count(distinct ts) from data_table limit 12;
      2374214 
      
      select count(distinct ts) from data_table limit 13;
      2572205     (correct)
      

      Compared to:

      select count(distinct ts) from data_table;
      2572205           
      
      hive -e "select count(distinct ts) from data_table limit 10;"
      2572205
      

      Thanks,
      Huifang
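
A possible cross-check, not part of the original report: the same distinct count can be written without count(distinct ...) by deduplicating in a subquery and counting the resulting rows. The alias t is an arbitrary name introduced here; the table and column names are the ones used in the queries above. Under standard SQL semantics this should match the result of the query without a limit.

```sql
-- Cross-check sketch: deduplicate ts first, then count the distinct values.
-- count(ts) skips NULL, matching the NULL handling of count(distinct ts).
-- 't' is an arbitrary alias introduced for this example.
select count(ts) from (select distinct ts from data_table) t;
```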

          People

            Assignee: Dimitris Tsirogiannis (dtsirogiannis)
            Reporter: Huifang Qin (huifangq_impala_0725)