Details
-
Improvement
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
Impala 2.6.0
Description
Runtime filters can be tens of MB or more, and incasting all filters from all shuffle joins to the coordinator can put a lot of memory pressure on that node. To alleviate this we should consider spreading out the aggregation operation across the cluster, so that a different node aggregates each runtime filter.
This still restricts aggregation to #runtime-filters nodes, which will usually be less than the cluster size. If we want to smooth that out further we could use tree-based aggregation, but let's measure the benefits of simply distributing the aggregation work first.
Attachments
Issue Links
- causes
-
IMPALA-13040 SIGSEGV in QueryState::UpdateFilterFromRemote
- Resolved
- is depended upon by
-
IMPALA-6412 Memory issues with processing of incoming global runtime filters on coordinator
- Resolved
- is related to
-
IMPALA-8687 --rpc_use_loopback may not work for runtime filter RPCs
- Resolved
-
IMPALA-7486 Admit less memory on dedicated coordinator for admission control purposes
- Resolved
-
IMPALA-4457 Coordinator sends out runtime filters serially
- Resolved
- relates to
-
IMPALA-6144 Coordinator threads that publish RuntimeFilters continue to run after query failure/cancellation
- Resolved
-
IMPALA-3701 Evaluate compressing Runtime filters to save coordinator network bandwidth
- Resolved