Details
-
Improvement
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
Impala 2.12.0
-
ghx-label-8
Description
Order by expressions in Analytical functions are not materialized and cause queries to run much slower.
The rewrite for the query below is 20x faster, profiles attached.
Repro
select * FROM ( SELECT o.*, ROW_NUMBER() OVER(ORDER BY evt_ts DESC) AS rn FROM ( SELECT l_orderkey,l_partkey,l_linenumber,l_quantity, cast (l_shipdate as string) evt_ts FROM lineitem WHERE l_shipdate BETWEEN '1992-01-01 00:00:00' AND '1992-01-15 00:00:00' ) o ) r WHERE rn BETWEEN 1 AND 101 ORDER BY rn;
Workaround
select * FROM ( SELECT o.*, ROW_NUMBER() OVER(ORDER BY evt_ts DESC) AS rn FROM ( SELECT l_orderkey,l_partkey,l_linenumber,l_quantity, cast (l_shipdate as string) evt_ts FROM lineitem WHERE l_shipdate BETWEEN '1992-01-01 00:00:00' AND '1992-01-15 00:00:00' union all SELECT l_orderkey,l_partkey,l_linenumber,l_quantity, cast (l_shipdate as string) evt_ts FROM lineitem limit 0 ) o ) r WHERE rn BETWEEN 1 AND 101 ORDER BY rn;