Details
-
Improvement
-
Status: Open
-
Major
-
Resolution: Unresolved
-
Impala 3.4.0
-
None
-
None
-
ghx-label-10
Description
When we join 2 subqueries each of which has its root fragment executing on the coordinator, currently the join is also executed on the coordinator. Here's an example query:
select count(*) from ( select rank() over (order by l_quantity) x from lineitem) dt1 inner join (select rank() over (order by l_shipdate) y from lineitem) dt2 on dt1.x = dt2.y
This can result in poor performance since the join inputs can be quite large. Ideally, we want to re-distribute both intermediate results after the rank() has been computed and do the join on executor nodes.
Another similar scenario is joining of subqueries which have ORDER BY LIMIT.
Attachments
Issue Links
- is related to
-
IMPALA-8483 Make coordinator fragment lighter-weight
- Open