Details
Bug
Status: Resolved
Blocker
Resolution: Fixed
Impala 2.3.0
None
Description
I hit a "Memory limit exceeded" error running TPC-H Q18 on the 100-scale text dataset with an 8000mb memory limit.
The query was able to complete with a much lower memory limit: 1800mb.
An initial look suggests that the join nodes temporarily reserved most or all of the blocks.
The likely precondition is that other nodes have partitions large enough to consume almost all of the available blocks with just a single partition per node pinned. If these partitions are pinned before another node gets its initial reservation, that node cannot obtain its minimum buffers, which would produce the observed behaviour.
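The ordering dependence described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Impala's block manager: the `BlockPool` class, the `run` helper, and the block counts (100 total, 60 and 40 for the joins, 4 for the aggregation's minimum) are all hypothetical. The join names mirror node ids 5 and 6, and `agg14` mirrors the aggregation with id=14.

```python
class BlockPool:
    """Toy fixed-size pool of blocks shared by several operators (hypothetical model)."""

    def __init__(self, total_blocks):
        self.total = total_blocks
        self.pinned = {}  # client name -> number of blocks currently pinned

    def free(self):
        return self.total - sum(self.pinned.values())

    def pin(self, client, n):
        """Pin n more blocks for client; return False if the pool cannot supply them."""
        if n > self.free():
            return False
        self.pinned[client] = self.pinned.get(client, 0) + n
        return True


def run(order, total_blocks=100):
    """Replay pin requests in the given order and report the outcome."""
    pool = BlockPool(total_blocks)
    wants = {"join5": 60, "join6": 40}  # assumed size of one pinned partition per join
    agg_min = 4                         # assumed minimum buffers for the aggregation
    for name in order:
        if name == "agg14":
            # The minimum reservation is all-or-nothing in this model.
            if not pool.pin(name, agg_min):
                return "agg14 cannot get its minimum buffers"
        else:
            # Joins can spill, so they take whatever is free up to their need.
            pool.pin(name, min(wants[name], pool.free()))
    return "ok"


print(run(["join5", "join6", "agg14"]))  # joins pin first
print(run(["agg14", "join5", "join6"]))  # aggregation reserves first
```

In this model the same pool size succeeds or fails purely on arrival order, which matches the hypothesis that the failure depends on when the initial reservation is taken rather than on total memory.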
use tpch100;
set mem_limit=8000mb;
select c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity)
from customer, orders, lineitem
where o_orderkey in (
    select l_orderkey
    from lineitem
    group by l_orderkey
    having sum(l_quantity) > 300
  )
  and c_custkey = o_custkey
  and o_orderkey = l_orderkey
group by c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice
order by o_totalprice desc, o_orderdate
limit 100;
Not enough memory to get the minimum required buffers for aggregation with id=14.
I1011 13:36:17.719748 31287 plan-fragment-executor.cc:303] Open(): instance_id=4a434673989dd50b:78d43f783784dcb2
I1011 13:41:20.710088 31190 status.cc:45] Memory limit exceeded
    @ 0x7efd80e386eb impala::Status::Status()
    @ 0x7efd80e38405 impala::Status::MemLimitExceeded()
    @ 0x7efd80a9f110 impala::PartitionedAggregationNode::Partition::Spill()
    @ 0x7efd80aa209d impala::PartitionedAggregationNode::SpillPartition()
    @ 0x7efcf78f3ab6 (unknown)
I1011 13:41:22.637529 31190 data-stream-mgr.cc:128] DeregisterRecvr(): fragment_instance_id=4a434673989dd50b:78d43f783784dca6, node=13
I1011 13:41:22.637567 31190 data-stream-recvr.cc:233] cancelled stream: fragment_instance_id_=4a434673989dd50b:78d43f783784dca6 node_id=13
I1011 13:41:22.703086 31139 status.cc:45] Memory limit exceeded
    @ 0x7efd80e386eb impala::Status::Status()
    @ 0x7efd80e38405 impala::Status::MemLimitExceeded()
    @ 0x7efd7ef87939 impala::RuntimeState::SetMemLimitExceeded()
    @ 0x7efd7ef661f3 impala::PlanFragmentExecutor::UpdateStatus()
    @ 0x7efd7ef64193 impala::PlanFragmentExecutor::Open()
    @ 0x7efd7e63bcc9 impala::FragmentMgr::FragmentExecState::Exec()
    @ 0x7efd7e660725 impala::FragmentMgr::FragmentExecThread()
    @ 0x7efd7e669102 boost::_mfi::mf1<>::operator()()
    @ 0x7efd7e668e73 boost::_bi::list2<>::operator()<>()
    @ 0x7efd7e668514 boost::_bi::bind_t<>::operator()()
    @ 0x7efd7e667a11 boost::detail::function::void_function_obj_invoker0<>::invoke()
    @ 0x7efd7ef506c7 boost::function0<>::operator()()
    @ 0x7efd7d6a53c9 impala::Thread::SuperviseThread()
    @ 0x7efd7d6aec97 boost::_bi::list4<>::operator()<>()
    @ 0x7efd7d6aebb8 boost::_bi::bind_t<>::operator()()
    @ 0x7efd7d6aeb6c boost::detail::thread_data<>::run()
    @ 0x7efd7cab609a (unknown)
    @ 0x7efd7bf586aa start_thread
    @ 0x7efd7a11eeed (unknown)
I1011 13:41:22.703302 31139 runtime-state.cc:229] Error from query 4a434673989dd50b:78d43f783784dca1: Memory Limit Exceeded
Query(4a434673989dd50b:78d43f783784dca1) Limit: Limit=7.81 GB Consumption=5.89 GB
  Fragment 4a434673989dd50b:78d43f783784dca3: Consumption=3.26 MB
    SORT_NODE (id=9): Consumption=4.00 KB
    AGGREGATION_NODE (id=16): Consumption=3.25 MB
    EXCHANGE_NODE (id=15): Consumption=0
    DataStreamRecvr: Consumption=0
    DataStreamSender: Consumption=1.59 KB
  Block Manager: Limit=6.25 GB Consumption=5.75 GB
  Fragment 4a434673989dd50b:78d43f783784dca6: Consumption=4.07 GB
    AGGREGATION_NODE (id=8): Consumption=3.25 MB
    HASH_JOIN_NODE (id=7): Consumption=24.00 KB
    HASH_JOIN_NODE (id=6): Consumption=1.26 GB
    HASH_JOIN_NODE (id=5): Consumption=2.78 GB
    EXCHANGE_NODE (id=10): Consumption=0
    DataStreamRecvr: Consumption=28.83 MB
    EXCHANGE_NODE (id=11): Consumption=0
    DataStreamRecvr: Consumption=0
    EXCHANGE_NODE (id=12): Consumption=0
    DataStreamRecvr: Consumption=0
    AGGREGATION_NODE (id=14): Consumption=0
    EXCHANGE_NODE (id=13): Consumption=0
    DataStreamSender: Consumption=4.78 KB
  Fragment 4a434673989dd50b:78d43f783784dcab: Consumption=18.23 MB
    AGGREGATION_NODE (id=4): Consumption=18.19 MB
    HDFS_SCAN_NODE (id=3): Consumption=0
    DataStreamSender: Consumption=40.00 KB
  Fragment 4a434673989dd50b:78d43f783784dcb2: Consumption=74.09 MB
    HDFS_SCAN_NODE (id=2): Consumption=73.98 MB
    DataStreamSender: Consumption=75.98 KB
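To see which plan nodes hold the memory, the dump can be ranked by consumption. The following is a rough sketch, not an Impala tool: `parse_consumption` and `UNITS` are hypothetical names, the regular expression only covers the `NAME (id=N): Consumption=X UNIT` entries shown in this ticket, and the units are assumed to be 1024-based. The sample string is an excerpt of the dump.

```python
import re

# Assumption: the pretty-printed units are binary (1 KB = 1024 bytes).
UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3}

# Matches e.g. "HASH_JOIN_NODE (id=5): Consumption=2.78 GB" or "... Consumption=0".
ENTRY = re.compile(r"(\w+) \(id=(\d+)\): Consumption=([\d.]+)(?: (GB|MB|KB|B)\b)?")


def parse_consumption(dump):
    """Return a list of (node label, bytes) for every plan-node entry in the dump."""
    entries = []
    for name, node_id, value, unit in ENTRY.findall(dump):
        entries.append((f"{name} (id={node_id})", float(value) * UNITS.get(unit or "B", 1)))
    return entries


sample = (
    "AGGREGATION_NODE (id=8): Consumption=3.25 MB "
    "HASH_JOIN_NODE (id=7): Consumption=24.00 KB "
    "HASH_JOIN_NODE (id=6): Consumption=1.26 GB "
    "HASH_JOIN_NODE (id=5): Consumption=2.78 GB "
    "EXCHANGE_NODE (id=10): Consumption=0 "
    "AGGREGATION_NODE (id=14): Consumption=0"
)

# Largest consumers first.
ranked = sorted(parse_consumption(sample), key=lambda e: e[1], reverse=True)
for label, size in ranked:
    print(f"{label}: {size / 1024**3:.2f} GB")
```

On the excerpt, the two largest join nodes (id=5 and id=6) report about 4.04 GB between them, while AGGREGATION_NODE (id=14), the node named in the error message, reports 0.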
Workaround
The problem only occurs for specific memory limit values: increasing or decreasing the memory limit will avoid the issue in most cases.