Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Impala 2.5.0
Description
Several values in HashTable and HashTableCtx are known at codegen time. Injecting them as constants could significantly reduce the size and complexity of the codegened functions and eliminate some instructions in a hot loop. As an initial experiment, I hardcoded some of these values to match what the aggregation node uses. The resulting Impala build can't execute hash joins, but it is significantly faster on some aggregations (25-30% for the low-NDV targeted-perf aggregations). The patch I used is attached, and a performance summary from a single-node run follows:
Report Generated on 2016-03-07
Run Description: "Base: 9b4fdbc488738bf59d3056fcfa4e106ad597d7e1 vs Ref: fa024b449297cb8fbcae55ded55266a02a99f18f"
Cluster Name: UNKNOWN
Lab Run Info: UNKNOWN
Impala Version: impalad version 2.5.0-cdh5.7.0 RELEASE (2016-02-24)
Baseline Impala Version: impalad version 2.5.0-cdh5.7.0 RELEASE (2016-02-24)
+--------------------+-----------------------+---------+------------+------------+----------------+
| Workload | File Format | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+--------------------+-----------------------+---------+------------+------------+----------------+
| TARGETED-PERF(_20) | parquet / none / none | 29.48 | +1.27% | 13.01 | -6.11% |
+--------------------+-----------------------+---------+------------+------------+----------------+
+--------------------+--------------------------------------------------------+-----------------------+--------+-------------+------------+-----------+----------------+-------------+-------+
| Workload | Query | File Format | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Num Clients | Iters |
+--------------------+--------------------------------------------------------+-----------------------+--------+-------------+------------+-----------+----------------+-------------+-------+
| TARGETED-PERF(_20) | primitive_groupby_decimal_highndv | parquet / none / none | 15.57 | 14.50 | +7.39% | 9.81% | 1.63% | 1 | 5 |
| TARGETED-PERF(_20) | primitive_shuffle_join_one_to_many_string_with_groupby | parquet / none / none | 108.90 | 106.32 | +2.42% | 1.49% | 0.68% | 1 | 5 |
| TARGETED-PERF(_20) | primitive_groupby_bigint_pk | parquet / none / none | 44.20 | 43.93 | +0.61% | 0.87% | 0.88% | 1 | 5 |
| TARGETED-PERF(_20) | primitive_shuffle_join_union_all_with_groupby | parquet / none / none | 22.44 | 22.51 | -0.33% | 0.31% | 0.38% | 1 | 5 |
| TARGETED-PERF(_20) | primitive_groupby_bigint_highndv | parquet / none / none | 11.63 | 11.81 | -1.57% | 1.86% | 0.90% | 1 | 5 |
| TARGETED-PERF(_20) | primitive_groupby_decimal_lowndv.test | parquet / none / none | 1.87 | 2.42 | I -22.67% | 1.81% | 7.17% | 1 | 5 |
| TARGETED-PERF(_20) | primitive_groupby_bigint_lowndv | parquet / none / none | 1.73 | 2.26 | I -23.24% | 1.18% | 3.28% | 1 | 5 |
+--------------------+--------------------------------------------------------+-----------------------+--------+-------------+------------+-----------+----------------+-------------+-------+
(I) Improvement: TARGETED-PERF(_20) primitive_groupby_decimal_lowndv.test [parquet / none / none] (2.42s -> 1.87s [-22.67%])
+--------------+------------+----------+----------+------------+------------+----------+----------+------------+--------+---------+-----------+
| Operator | % of Query | Avg | Base Avg | Delta(Avg) | StdDev(%) | Max | Base Max | Delta(Max) | #Hosts | #Rows | Est #Rows |
+--------------+------------+----------+----------+------------+------------+----------+----------+------------+--------+---------+-----------+
| 03:AGGREGATE | 5.60% | 102.06ms | 98.47ms | +3.65% | * 13.45% * | 123.97ms | 112.22ms | +10.48% | 1 | 0 | 1 |
| 01:AGGREGATE | 81.74% | 1.49s | 2.07s | -27.98% | 2.11% | 1.54s | 2.36s | -34.74% | 1 | 11 | 11 |
| 00:SCAN HDFS | 12.66% | 230.69ms | 195.33ms | +18.10% | 4.92% | 247.22ms | 203.59ms | +21.43% | 1 | 119.99M | 119.99M |
+--------------+------------+----------+----------+------------+------------+----------+----------+------------+--------+---------+-----------+
(I) Improvement: TARGETED-PERF(_20) primitive_groupby_bigint_lowndv [parquet / none / none] (2.26s -> 1.73s [-23.24%])
+--------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+---------+-----------+
| Operator | % of Query | Avg | Base Avg | Delta(Avg) | StdDev(%) | Max | Base Max | Delta(Max) | #Hosts | #Rows | Est #Rows |
+--------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+---------+-----------+
| 03:AGGREGATE | 5.65% | 94.34ms | 93.95ms | +0.41% | 4.99% | 102.10ms | 98.69ms | +3.45% | 1 | 0 | 1 |
| 01:AGGREGATE | 81.74% | 1.36s | 1.91s | -28.61% | 1.77% | 1.40s | 2.03s | -30.98% | 1 | 7 | 7 |
| 00:SCAN HDFS | 12.60% | 210.34ms | 196.91ms | +6.82% | 6.30% | 232.58ms | 198.16ms | +17.37% | 1 | 119.99M | 119.99M |
+--------------+------------+----------+----------+------------+-----------+----------+----------+------------+--------+---------+-----------+
Significant perf change detected
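
To illustrate the idea in the description, here is a minimal, self-contained C++ sketch of why turning run-time fields into constants can shrink a hot loop. It is not Impala's code and not the attached patch: `HashCtxSketch`, `HashRowGeneric`, `HashRowSpecialized`, the three fields, and the toy hash function are all hypothetical. Template parameters stand in for constants that a code generator would inject; the real mechanism in Impala operates on generated LLVM IR rather than C++ templates.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for per-query hash table state. The field names are
// illustrative only and are not taken from Impala's HashTableCtx.
struct HashCtxSketch {
  int num_keys;       // number of grouping key columns
  bool stores_nulls;  // whether NULL keys need special handling
  uint32_t seed;      // initial hash seed
};

// Toy mixing step shared by both versions (not Impala's hash function).
inline uint32_t MixKey(uint32_t h, int64_t key) {
  const uint64_t v = static_cast<uint64_t>(key);
  return h * 31u + static_cast<uint32_t>(v ^ (v >> 32));
}

// Generic version: every setting is loaded from the context at run time, so
// the loop bound and the NULL branch stay in the compiled hot loop.
inline uint32_t HashRowGeneric(const HashCtxSketch& ctx, const int64_t* keys,
                               const bool* is_null) {
  uint32_t h = ctx.seed;
  for (int i = 0; i < ctx.num_keys; ++i) {
    if (ctx.stores_nulls && is_null[i]) {
      h = h * 31u + 0x9e3779b9u;  // fixed marker value for a NULL key
    } else {
      h = MixKey(h, keys[i]);
    }
  }
  return h;
}

// Specialized version: the same settings are compile-time constants, so the
// compiler is free to unroll the loop and drop the NULL branch entirely when
// kStoresNulls is false.
template <int kNumKeys, bool kStoresNulls, uint32_t kSeed>
inline uint32_t HashRowSpecialized(const int64_t* keys, const bool* is_null) {
  uint32_t h = kSeed;
  for (int i = 0; i < kNumKeys; ++i) {
    if (kStoresNulls && is_null[i]) {
      h = h * 31u + 0x9e3779b9u;
    } else {
      h = MixKey(h, keys[i]);
    }
  }
  return h;
}
```

Both versions compute the same hash for the same settings, for example `HashRowSpecialized<2, true, 1234u>(keys, is_null)` matches `HashRowGeneric` called with `HashCtxSketch{2, true, 1234u}`. The difference is only in how much work the optimizer can remove, which is the effect the experiment above measures on the aggregation operator.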