Details
-
Bug
-
Status: Resolved
-
Minor
-
Resolution: Cannot Reproduce
-
Impala 2.0
-
None
Description
The following query on a relatively small input takes ~70s with codegen and ~35 minutes without codegen, which seems unreasonably long. We should make sure we're not doing anything crazy in the non-codegen eval path.
Executing with codegen:
use functional; select sum(t1.int_col * t1.id) OVER ( ORDER BY t1.int_col * t1.id ASC, t3.year + t3.tinyint_col ASC ROWS BETWEEN 66 PRECEDING AND 21 FOLLOWING) AS int_col_1 FROM alltypes t1 INNER JOIN alltypes t2 ON t2.month = t1.smallint_col INNER JOIN alltypes t3 ON t3.bigint_col = t2.bigint_col WHERE t2.id <= t3.smallint_col + t2.tinyint_col order by 1 limit 5; Operator #Hosts Avg Time Max Time #Rows Est. #Rows Peak Mem Est. Peak Mem Detail ------------------------------------------------------------------------------------------------------------------------- 07:TOP-N 1 1s578ms 1s578ms 5 5 16.02 MB -1.00 B 06:ANALYTIC 1 12s179ms 12s179ms 5.33M 3.89B 32.25 MB -1.00 B 11:MERGING-EXCHANGE 1 615.834ms 615.834ms 5.33M 3.89B 0 -1.00 B UNPARTITIONED 05:SORT 3 9s857ms 29s572ms 5.33M 3.89B 256.03 MB 1.09 GB 04:HASH JOIN 3 26s717ms 27s350ms 5.33M 3.89B 4.07 MB 117.63 KB INNER JOIN, BROADCAST |--10:EXCHANGE 3 1.682ms 1.841ms 7.30K 7.30K 0 0 BROADCAST | 02:SCAN HDFS 3 238.331ms 270.319ms 7.30K 7.30K 329.00 KB 160.00 MB functional.alltypes t3 03:HASH JOIN 3 34.448ms 41.592ms 3.99M 5.33M 4.10 MB 26.14 KB INNER JOIN, PARTITIONED |--09:EXCHANGE 3 392.297us 497.645us 7.30K 7.30K 0 0 HASH(t1.smallint_col) | 00:SCAN HDFS 3 232.237ms 240.65ms 7.30K 7.30K 329.00 KB 160.00 MB functional.alltypes t1 08:EXCHANGE 3 347.748us 381.88us 7.30K 7.30K 0 0 HASH(t2.month) 01:SCAN HDFS 3 191.334ms 237.188ms 7.30K 7.30K 385.00 KB 160.00 MB functional.alltypes t2
The query profile when running without codegen is attached. I also used perf (linux profiling tool) to collect a profile (sampled stacks) and printed the aggregated callstacks to the attached file perf-histograms.out.
The summary without codegen. Most of the time (28mins) is spent in the hash joins:
Operator #Hosts Avg Time Max Time #Rows Est. #Rows Peak Mem Est. Peak Mem Detail -------------------------------------------------------------------------------------------------------------------------------- 07:TOP-N 1 1s515ms 1s515ms 5 5 16.02 MB -1.00 B 06:ANALYTIC 1 10s945ms 10s945ms 5.33M -1 32.25 MB -1.00 B 10:MERGING-EXCHANGE 1 639.447ms 639.447ms 5.33M -1 0 -1.00 B UNPARTITIONED 05:SORT 1 2m23s 2m23s 5.33M -1 248.17 MB 0 04:HASH JOIN 1 28m37s 28m37s 5.33M -1 4.07 MB 2.00 GB INNER JOIN, BROADCAST |--09:EXCHANGE 1 1.952ms 1.952ms 7.30K -1 0 0 BROADCAST | 02:SCAN HDFS 1 584.840ms 584.840ms 7.30K -1 191.00 KB 48.00 MB functional_parquet.alltypes t3 03:HASH JOIN 1 1s126ms 1s126ms 3.99M -1 4.10 MB 2.00 GB INNER JOIN, BROADCAST |--08:EXCHANGE 1 1.811ms 1.811ms 7.30K -1 0 0 BROADCAST | 01:SCAN HDFS 1 585.767ms 585.767ms 7.30K -1 212.00 KB 48.00 MB functional_parquet.alltypes t2 00:SCAN HDFS 1 14.134ms 14.134ms 7.30K -1 924.00 KB 48.00 MB functional_parquet.alltypes t1