IMPALA-1802

Impala produces incorrect count(distinct xxx) result with limit clause

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.1
    • Fix Version/s: Impala 2.2
    • Component/s: None
    • Labels:

      Description

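      On a partitioned table, Impala returned the results below. Because count(distinct ts) without a GROUP BY produces a single row, a LIMIT of 1 or more should not change the value, so all of these queries should return the same number. We hope this bug can be fixed.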

      select count(distinct ts) from data_table limit 10;
      1978021
      
      select count(distinct ts) from data_table limit 11;
      2176581   
      
      select count(distinct ts) from data_table limit 12;
      2374214 
      
      select count(distinct ts) from data_table limit 13;
      2572205     (correct)
      

      Compared to:

      select count(distinct ts) from data_table;
      2572205           
      
      hive -e "select count(distinct ts) from data_table limit 10;"
      2572205
      

      Thanks,
      Huifang
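
The expectation stated above can be checked mechanically. The following is a minimal sketch, assuming SQLite (via Python's standard sqlite3 module) as a stand-in engine; it does not run against Impala and does not reproduce the bug. The table contents and the helper name count_distinct_ts are hypothetical and only reuse the table and column names from the report. It compares the unlimited count with the same query under each LIMIT value from the report and collects any mismatch.

```python
import sqlite3

# Stand-in for the partitioned table in the report. SQLite is used only to
# illustrate the expected SQL semantics, not Impala's execution.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data_table (part INTEGER, ts INTEGER)")

# Hypothetical data: 4 "partitions" with overlapping ts ranges, so that
# distinct values must be merged across partitions.
rows = [(p, ts) for p in range(4) for ts in range(p * 50, p * 50 + 100)]
conn.executemany("INSERT INTO data_table VALUES (?, ?)", rows)


def count_distinct_ts(limit=None):
    """Run the query from the report, optionally with a LIMIT clause."""
    sql = "SELECT count(DISTINCT ts) FROM data_table"
    if limit is not None:
        sql += f" LIMIT {int(limit)}"
    return conn.execute(sql).fetchone()[0]


baseline = count_distinct_ts()
results = {n: count_distinct_ts(n) for n in (10, 11, 12, 13)}

# The aggregate yields exactly one row, so any LIMIT >= 1 must leave it unchanged.
mismatches = {n: c for n, c in results.items() if c != baseline}
print("baseline:", baseline, "mismatches:", mismatches)
```

An empty mismatches dictionary is the expected outcome on a correct engine. Against the affected Impala version, the same comparison would flag the values reported for limit 10, 11, and 12.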

            People

            • Assignee: Dimitris Tsirogiannis (dtsirogiannis)
            • Reporter: Huifang Qin (huifangq_impala_0725)
