IMPALA-5577

Memory leak when looping a select, CTAS, and daemon crashes


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Won't Fix
    • Affects Version/s: Impala 2.9.0
    • Fix Version/s: None
    • Component/s: Backend
    • Labels: None

    Description

      While trying to reproduce IMPALA-5558 I hit a "Memory limit exceeded" error.

      I0625 04:53:07.747248  7944 status.cc:55] Memory limit exceeded: Error occurred on backend lv-desktop:22000 by fragment 6c459f163a70ea25:7765587500000002
      Memory left in process limit: -350618.00 B
      Process: memory limit exceeded. Limit=8.35 GB Total=8.35 GB Peak=8.38 GB
        RequestPool=default-pool: Total=85.35 MB Peak=181.51 MB
          Query(6c459f163a70ea25:7765587500000000): Total=85.35 MB Peak=85.61 MB
            Fragment 6c459f163a70ea25:7765587500000002: Total=85.35 MB Peak=85.61 MB
              HDFS_SCAN_NODE (id=0): Total=85.29 MB Peak=85.55 MB
              HdfsTableSink: Total=50.00 KB Peak=50.00 KB
              CodeGen: Total=415.00 B Peak=49.00 KB
            Block Manager: Limit=6.68 GB Total=0 Peak=0
        Untracked Memory: Total=8.27 GB
          @     0x7fe3f28adf1e  impala::Status::Status()
          @     0x7fe3f28adc5e  impala::Status::MemLimitExceeded()
          @     0x7fe3f2035a4a  impala::MemTracker::MemLimitExceeded()
          @     0x7fe3f2076b24  impala::RuntimeState::SetMemLimitExceeded()
          @     0x7fe3f2076e4a  impala::RuntimeState::CheckQueryState()
          @     0x7fe3f170ccaf  impala::ExecNode::QueryMaintenance()
          @     0x7fe3f174e577  impala::HdfsScanNode::GetNextInternal()
          @     0x7fe3f174e222  impala::HdfsScanNode::GetNext()
          @     0x7fe3f201ffde  impala::FragmentInstanceState::ExecInternal()
          @     0x7fe3f201d949  impala::FragmentInstanceState::Exec()
          @     0x7fe3f20472f6  impala::QueryState::ExecFInstance()
          @     0x7fe3f2054efa  boost::_mfi::mf1<>::operator()()
          @     0x7fe3f20542dd  boost::_bi::list2<>::operator()<>()
          @     0x7fe3f2053935  boost::_bi::bind_t<>::operator()()
          @     0x7fe3f2052596  boost::detail::function::void_function_obj_invoker0<>::invoke()
          @     0x7fe3f2519a3c  boost::function0<>::operator()()
          @     0x7fe3f25170cf  impala::Thread::SuperviseThread()
          @     0x7fe3f25205c8  boost::_bi::list4<>::operator()<>()
          @     0x7fe3f252050b  boost::_bi::bind_t<>::operator()()
          @     0x7fe3f25204ce  boost::detail::thread_data<>::run()
          @           0x87d2aa  thread_proxy
          @     0x7fe3ec52e184  start_thread
          @     0x7fe3ec25bbed  clone
      I0625 04:53:07.747314  7944 runtime-state.cc:194] Error from query 6c459f163a70ea25:7765587500000000: Memory limit exceeded: Error occurred on backend lv-desktop:22000 by fragment 6c459f163a70ea25:7765587500000002
      Memory left in process limit: -350618.00 B
      Process: memory limit exceeded. Limit=8.35 GB Total=8.35 GB Peak=8.38 GB
        RequestPool=default-pool: Total=85.35 MB Peak=181.51 MB
          Query(6c459f163a70ea25:7765587500000000): Total=85.35 MB Peak=85.61 MB
            Fragment 6c459f163a70ea25:7765587500000002: Total=85.35 MB Peak=85.61 MB
              HDFS_SCAN_NODE (id=0): Total=85.29 MB Peak=85.55 MB
              HdfsTableSink: Total=50.00 KB Peak=50.00 KB
              CodeGen: Total=415.00 B Peak=49.00 KB
            Block Manager: Limit=6.68 GB Total=0 Peak=0
        Untracked Memory: Total=8.27 GB
      

      My test does the following in a loop:

      • Run 4 select queries to warm up the client caches
      • Restart the second node of the local minicluster
      • Run a CTAS query and make sure it succeeded

      The script to run this is here: https://gist.github.com/lekv/0093bf133d2c61267af0f910348da124

      The CTAS query in the last step hit the "Memory limit exceeded" error. Note that the tracked queries account for only ~85 MB while the process shows Untracked Memory: Total=8.27 GB, so it looks like there is a leak somewhere outside the query MemTrackers.
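      As a rough sketch, the loop can be expressed as follows. The restart helper, iteration count, and CTAS statement are placeholders, not the exact gist contents; the authoritative script is the gist linked above. The sketch defaults to dry-run and only records the commands it would execute:

```shell
#!/bin/bash
# Sketch of the repro loop described above. The restart helper and the
# CTAS statement are assumptions; see the linked gist for the real script.
set -u

DRY_RUN=${DRY_RUN:-1}   # set DRY_RUN=0 to run against a real minicluster
LOG=()

run() {
  LOG+=("$*")
  if [ "$DRY_RUN" = 0 ]; then "$@"; fi
}

for i in $(seq 1 "${ITERATIONS:-100}"); do
  # 1. Warm up the client connection caches (four selects; see warmup.sh).
  run ./warmup.sh
  # 2. Restart the second node of the local minicluster (placeholder name).
  run restart_second_impalad
  # 3. Run a CTAS and fail fast if it did not succeed.
  run impala-shell.sh -i localhost:21000 -q \
    "create table ctas_leak_$i as select * from functional.alltypes" ||
    { echo "CTAS failed on iteration $i" >&2; exit 1; }
done
```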

      Edit:
      The script uses warmup.sh to create connections in the client cache:

      #!/bin/bash
      # Run the union query twice against each of the two coordinators to
      # populate their client connection caches.
      impala-shell.sh -i localhost:21000 -f union50.sql
      impala-shell.sh -i localhost:21000 -f union50.sql
      impala-shell.sh -i localhost:21001 -f union50.sql
      impala-shell.sh -i localhost:21001 -f union50.sql
      

      union50.sql can be found here and unions the same query 50 times.
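      The linked file is not reproduced here, but a file of that shape can be generated as follows. The inner SELECT is a placeholder (the real file may union a different query, and may use UNION rather than UNION ALL):

```shell
#!/bin/bash
# Illustrative generator for a union50.sql-style file: the same SELECT
# unioned 50 times. The inner query is a placeholder, not the real one.
QUERY="select count(*) from functional.alltypes"
{
  echo "$QUERY"
  for i in $(seq 2 50); do
    echo "union all $QUERY"
  done
  echo ";"
} > union50_generated.sql
```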

      Attachments

        1. pprof-growth.txt
          23 kB
          Lars Volker
        2. pprof-growth-off.pdf
          14 kB
          Lars Volker


            People

              Assignee: Sailesh Mukil (sailesh)
              Reporter: Lars Volker (lv)
              Votes: 0
              Watchers: 5
