Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-1414

Slow query without codegen

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Minor
    • Resolution: Cannot Reproduce
    • Impala 2.0
    • None
    • Perf Investigation

    Description

      The following query on a relatively small input takes ~70s with codegen and ~35 minutes without codegen, which seems unreasonably long. We should make sure we're not doing anything crazy in the non-codegen eval path.

      Executing with codegen:

      use functional;
      select sum(t1.int_col * t1.id) OVER (
      ORDER BY t1.int_col * t1.id ASC, t3.year + t3.tinyint_col ASC ROWS
      BETWEEN 66 PRECEDING AND 21 FOLLOWING) AS int_col_1
      FROM alltypes t1
      INNER JOIN alltypes t2 ON t2.month = t1.smallint_col
      INNER JOIN alltypes t3 ON t3.bigint_col = t2.bigint_col
      WHERE
      t2.id <= t3.smallint_col + t2.tinyint_col
      order by 1
      limit 5;
      
      Operator              #Hosts   Avg Time   Max Time  #Rows  Est. #Rows   Peak Mem  Est. Peak Mem  Detail                  
      -------------------------------------------------------------------------------------------------------------------------
      07:TOP-N                   1    1s578ms    1s578ms      5           5   16.02 MB        -1.00 B                          
      06:ANALYTIC                1   12s179ms   12s179ms  5.33M       3.89B   32.25 MB        -1.00 B                          
      11:MERGING-EXCHANGE        1  615.834ms  615.834ms  5.33M       3.89B          0        -1.00 B  UNPARTITIONED           
      05:SORT                    3    9s857ms   29s572ms  5.33M       3.89B  256.03 MB        1.09 GB                          
      04:HASH JOIN               3   26s717ms   27s350ms  5.33M       3.89B    4.07 MB      117.63 KB  INNER JOIN, BROADCAST   
      |--10:EXCHANGE             3    1.682ms    1.841ms  7.30K       7.30K          0              0  BROADCAST               
      |  02:SCAN HDFS            3  238.331ms  270.319ms  7.30K       7.30K  329.00 KB      160.00 MB  functional.alltypes t3  
      03:HASH JOIN               3   34.448ms   41.592ms  3.99M       5.33M    4.10 MB       26.14 KB  INNER JOIN, PARTITIONED 
      |--09:EXCHANGE             3  392.297us  497.645us  7.30K       7.30K          0              0  HASH(t1.smallint_col)   
      |  00:SCAN HDFS            3  232.237ms   240.65ms  7.30K       7.30K  329.00 KB      160.00 MB  functional.alltypes t1  
      08:EXCHANGE                3  347.748us   381.88us  7.30K       7.30K          0              0  HASH(t2.month)          
      01:SCAN HDFS               3  191.334ms  237.188ms  7.30K       7.30K  385.00 KB      160.00 MB  functional.alltypes t2
      

      The query profile when running without codegen is attached. I also used perf (linux profiling tool) to collect a profile (sampled stacks) and printed the aggregated callstacks to the attached file perf-histograms.out.

      The summary without codegen. Most of the time (28mins) is spent in the hash joins:

      Operator              #Hosts   Avg Time   Max Time  #Rows  Est. #Rows   Peak Mem  Est. Peak Mem  Detail                         
      --------------------------------------------------------------------------------------------------------------------------------
      07:TOP-N                   1    1s515ms    1s515ms      5           5   16.02 MB        -1.00 B                                 
      06:ANALYTIC                1   10s945ms   10s945ms  5.33M          -1   32.25 MB        -1.00 B                                 
      10:MERGING-EXCHANGE        1  639.447ms  639.447ms  5.33M          -1          0        -1.00 B  UNPARTITIONED                  
      05:SORT                    1      2m23s      2m23s  5.33M          -1  248.17 MB              0                                 
      04:HASH JOIN               1     28m37s     28m37s  5.33M          -1    4.07 MB        2.00 GB  INNER JOIN, BROADCAST          
      |--09:EXCHANGE             1    1.952ms    1.952ms  7.30K          -1          0              0  BROADCAST                      
      |  02:SCAN HDFS            1  584.840ms  584.840ms  7.30K          -1  191.00 KB       48.00 MB  functional_parquet.alltypes t3 
      03:HASH JOIN               1    1s126ms    1s126ms  3.99M          -1    4.10 MB        2.00 GB  INNER JOIN, BROADCAST          
      |--08:EXCHANGE             1    1.811ms    1.811ms  7.30K          -1          0              0  BROADCAST                      
      |  01:SCAN HDFS            1  585.767ms  585.767ms  7.30K          -1  212.00 KB       48.00 MB  functional_parquet.alltypes t2 
      00:SCAN HDFS               1   14.134ms   14.134ms  7.30K          -1  924.00 KB       48.00 MB  functional_parquet.alltypes t1 
      

      Attachments

        1. perf-histograms.out
          2.10 MB
          Matthew Jacobs
        2. slow_query_no_codegen.txt
          32 kB
          Matthew Jacobs

        Activity

          People

            Unassigned Unassigned
            mjacobs Matthew Jacobs
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: