Details
Bug
Status: Resolved
Blocker
Resolution: Fixed
Impala 2.1
Description
I was running a simple stress test for spilling (repeatedly running test_mem_scaling.py for 4 hours) with the latest code (a298a37), and at some point one of the Impalads crashed with the following. I don't think we have seen this before.
hs_err:
Stack: [0x00007f381db5b000,0x00007f381e35c000], sp=0x00007f381e35a1c0, free space=8188k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C [impalad+0xaf6499] impala::ScopedTimer<impala::MonotonicStopWatch>::UpdateCounter()+0x49
C [impalad+0xaf1c0e] impala::ScopedTimer<impala::MonotonicStopWatch>::~ScopedTimer()+0x24
C [impalad+0x106208e] impala::DataStreamRecvr::SenderQueue::AddBatch(impala::TRowBatch const&)+0x384
C [impalad+0x10638bc] impala::DataStreamRecvr::AddBatch(impala::TRowBatch const&, int)+0x5c
C [impalad+0x105d35e] impala::DataStreamMgr::AddData(impala::TUniqueId const&, int, impala::TRowBatch const&, int)+0x1b8
C [impalad+0xc25f17] impala::ImpalaServer::TransmitData(impala::TTransmitDataResult&, impala::TTransmitDataParams const&)+0x207
C [impalad+0xc3fa89] impala::ImpalaInternalService::TransmitData(impala::TTransmitDataResult&, impala::TTransmitDataParams const&)+0x37
C [impalad+0xefb056] impala::ImpalaInternalServiceProcessor::process_TransmitData(int, apache::thrift::protocol::TProtocol*, apache::thrift::protocol::TProtocol*, void*)+0x21c
C [impalad+0xef97c7] impala::ImpalaInternalServiceProcessor::dispatchCall(apache::thrift::protocol::TProtocol*, apache::thrift::protocol::TProtocol*, std::string const&, int, void*)+0x2b1
C [impalad+0xc2e8e6] apache::thrift::TDispatchProcessor::process(boost::shared_ptr<apache::thrift::protocol::TProtocol>, boost::shared_ptr<apache::thrift::protocol::TProtocol>, void*)+0xce
gdb:
#0 0x00007f389a7810d5 in __GI_raise (sig=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 0x00007f389a78483b in __GI_abort () at abort.c:91
#2 0x00007f38999c7535 in os::abort(bool) () from /usr/lib/jvm/jdk1.7.0_55/jre/lib/amd64/server/libjvm.so
#3 0x00007f3899b46457 in VMError::report_and_die() () from /usr/lib/jvm/jdk1.7.0_55/jre/lib/amd64/server/libjvm.so
#4 0x00007f38999cbebf in JVM_handle_linux_signal () from /usr/lib/jvm/jdk1.7.0_55/jre/lib/amd64/server/libjvm.so
#5 <signal handler called>
#6 0x0000000000ef6499 in impala::ScopedTimer<impala::MonotonicStopWatch>::UpdateCounter (this=0x7f381e35a260) at /SSD/ipandis/GITPROJECTS/Impala/be/src/util/runtime-profile.h:717
#7 0x0000000000ef1c0e in impala::ScopedTimer<impala::MonotonicStopWatch>::~ScopedTimer (this=0x7f381e35a260, __in_chrg=<optimized out>) at /SSD/ipandis/GITPROJECTS/Impala/be/src/util/runtime-profile.h:730
#8 0x000000000146208e in impala::DataStreamRecvr::SenderQueue::AddBatch (this=0x82e2a20, thrift_batch=...) at /SSD/ipandis/GITPROJECTS/Impala/be/src/runtime/data-stream-recvr.cc:173
#9 0x00000000014638bc in impala::DataStreamRecvr::AddBatch (this=0x46c76160, thrift_batch=..., sender_id=2) at /SSD/ipandis/GITPROJECTS/Impala/be/src/runtime/data-stream-recvr.cc:320
#10 0x000000000145d35e in impala::DataStreamMgr::AddData (this=0x7503050, fragment_instance_id=..., dest_node_id=5, thrift_batch=..., sender_id=2) at /SSD/ipandis/GITPROJECTS/Impala/be/src/runtime/data-stream-mgr.cc:103
#11 0x0000000001025f17 in impala::ImpalaServer::TransmitData (this=0x3973180, return_val=..., params=...) at /SSD/ipandis/GITPROJECTS/Impala/be/src/service/impala-server.cc:969
#12 0x000000000103fa89 in impala::ImpalaInternalService::TransmitData (this=0x5a52270, return_val=..., params=...) at /SSD/ipandis/GITPROJECTS/Impala/be/src/service/impala-internal-service.h:52
#13 0x00000000012fb056 in impala::ImpalaInternalServiceProcessor::process_TransmitData (this=0x6e36fc0, seqid=0, iprot=0x7fb56c0, oprot=0x7fb5680, callContext=0x83a4600) at /SSD/ipandis/GITPROJECTS/Impala/be/generated-sources/gen-cpp/ImpalaInternalService.cpp:1111
#14 0x00000000012f97c7 in impala::ImpalaInternalServiceProcessor::dispatchCall (this=0x6e36fc0, iprot=0x7fb56c0, oprot=0x7fb5680, fname="TransmitData", seqid=0, callContext=0x83a4600) at /SSD/ipandis/GITPROJECTS/Impala/be/generated-sources/gen-cpp/ImpalaInternalService.cpp:922
#15 0x000000000102e8e6 in apache::thrift::TDispatchProcessor::process (this=0x6e36fc0, in=(boost::shared_ptr<apache::thrift::protocol::TProtocol>) (count 2, weak count 1) 0x7fb56c0, out=(boost::shared_ptr<apache::thrift::protocol::TProtocol>) (count 2, weak count 1) 0x7fb5680, connectionContext=0x83a4600) at /SSD/ipandis/GITPROJECTS/Impala/thirdparty/thrift-0.9.0/build/include/thrift/TDispatchProcessor.h:121
#16 0x000000000202d3cd in apache::thrift::server::TThreadedServer::Task::run (this=0xa3da5a0) at src/thrift/server/TThreadedServer.cpp:70
#17 0x0000000000f7460f in impala::ThriftThread::RunRunnable (this=0x825b7c0, runnable=(boost::shared_ptr<apache::thrift::concurrency::Runnable>) (count 4, weak count 1) 0xa3da5a0, promise=0x7f3870f01620) at /SSD/ipandis/GITPROJECTS/Impala/be/src/rpc/thrift-thread.cc:61
#18 0x0000000000f75c74 in boost::_mfi::mf2<void, impala::ThriftThread, boost::shared_ptr<apache::thrift::concurrency::Runnable>, impala::Promise<unsigned long>*>::operator() (this=0x7dd0600, p=0x825b7c0, a1=(boost::shared_ptr<apache::thrift::concurrency::Runnable>) (count 4, weak count 1) 0xa3da5a0, a2=0x7f3870f01620) at /usr/include/boost/bind/mem_fn_template.hpp:280
#19 0x0000000000f75adc in boost::_bi::list3<boost::_bi::value<impala::ThriftThread*>, boost::_bi::value<boost::shared_ptr<apache::thrift::concurrency::Runnable> >, boost::_bi::value<impala::Promise<unsigned long>*> >::operator()<boost::_mfi::mf2<void, impala::ThriftThread, boost::shared_ptr<apache::thrift::concurrency::Runnable>, impala::Promise<unsigned long>*>, boost::_bi::list0> (this=0x7dd0610, f=..., a=...) at /usr/include/boost/bind/bind.hpp:392
#20 0x0000000000f7586f in boost::_bi::bind_t<void, boost::_mfi::mf2<void, impala::ThriftThread, boost::shared_ptr<apache::thrift::concurrency::Runnable>, impala::Promise<unsigned long>*>, boost::_bi::list3<boost::_bi::value<impala::ThriftThread*>, boost::_bi::value<boost::shared_ptr<apache::thrift::concurrency::Runnable> >, boost::_bi::value<impala::Promise<unsigned long>*> > >::operator() (this=0x7dd0600) at /usr/include/boost/bind/bind_template.hpp:20
#21 0x0000000000f75792 in boost::detail::function::void_function_obj_invoker0<boost::_bi::bind_t<void, boost::_mfi::mf2<void, impala::ThriftThread, boost::shared_ptr<apache::thrift::concurrency::Runnable>, impala::Promise<unsigned long>*>, boost::_bi::list3<boost::_bi::value<impala::ThriftThread*>, boost::_bi::value<boost::shared_ptr<apache::thrift::concurrency::Runnable> >, boost::_bi::value<impala::Promise<unsigned long>*> > >, void>::invoke (function_obj_ptr=...) at /usr/include/boost/function/function_template.hpp:153
#22 0x0000000000f991b2 in boost::function0<void>::operator() (this=0x7f381e35adb0) at /usr/include/boost/function/function_template.hpp:1013
#23 0x00000000011d7f16 in impala::Thread::SuperviseThread(std::string const&, std::string const&, boost::function<void ()>, impala::Promise<long>*) (name="backend-2", category="thrift-server", functor=..., thread_started=0x7f3870f01460) at /SSD/ipandis/GITPROJECTS/Impala/be/src/util/thread.cc:311
#24 0x00000000011dfba2 in boost::_bi::list4<boost::_bi::value<std::string>, boost::_bi::value<std::string>, boost::_bi::value<boost::function<void ()> >, boost::_bi::value<impala::Promise<long>*> >::operator()<void (*)(std::string const&, std::string const&, boost::function<void ()>, impala::Promise<long>*), boost::_bi::list0>(boost::_bi::type<void>, void (*&)(std::string const&, std::string const&, boost::function<void ()>, impala::Promise<long>*), boost::_bi::list0&, int) (this=0x8393630, f=@0x8393628: 0x11d7c38 <impala::Thread::SuperviseThread(std::string const&, std::string const&, boost::function<void ()>, impala::Promise<long>*)>, a=...) at /usr/include/boost/bind/bind.hpp:457
#25 0x00000000011dfaeb in boost::_bi::bind_t<void, void (*)(std::string const&, std::string const&, boost::function<void ()>, impala::Promise<long>*), boost::_bi::list4<boost::_bi::value<std::string>, boost::_bi::value<std::string>, boost::_bi::value<boost::function<void ()> >, boost::_bi::value<impala::Promise<long>*> > >::operator()() (this=0x8393628) at /usr/include/boost/bind/bind_template.hpp:20
#26 0x00000000011dfa7e in boost::detail::thread_data<boost::_bi::bind_t<void, void (*)(std::string const&, std::string const&, boost::function<void ()>, impala::Promise<long>*), boost::_bi::list4<boost::_bi::value<std::string>, boost::_bi::value<std::string>, boost::_bi::value<boost::function<void ()> >, boost::_bi::value<impala::Promise<long>*> > > >::run() (this=0x83934a0) at /usr/include/boost/thread/detail/thread.hpp:61
#27 0x00007f389cef5ce9 in thread_proxy () from /usr/lib/libboost_thread.so.1.46.1
#28 0x00007f389ccd3e9a in start_thread (arg=0x7f381e35b700) at pthread_create.c:308
#29 0x00007f389a83f2ed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#30 0x0000000000000000 in ?? ()
I kept the coredump.
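For anyone trying to reproduce this, the stress loop described above (rerunning one test for a fixed wall-clock period) could be sketched roughly as follows. This is only an illustration, not the script that was actually used: the helper name run_until_deadline and the example command line are assumptions, and the real invocation of test_mem_scaling.py depends on the local test setup.

```python
import subprocess
import time


def run_until_deadline(cmd, duration_s, clock=time.monotonic, run=subprocess.call):
    """Rerun cmd until duration_s seconds have elapsed or one run fails.

    Hypothetical helper: clock and run are injectable so the loop can be
    exercised without launching real processes.
    Returns (iterations_completed, last_return_code).
    """
    deadline = clock() + duration_s
    iterations = 0
    rc = 0
    while clock() < deadline:
        rc = run(cmd)  # blocks until the test process exits
        iterations += 1
        if rc != 0:
            # Stop at the first failure so logs and any core dump are not
            # overwritten by later iterations.
            break
    return iterations, rc


# Example (assumed command line; adjust to the local test runner):
# n, rc = run_until_deadline(["python", "test_mem_scaling.py"], 4 * 60 * 60)
# print("completed %d runs, last exit code %d" % (n, rc))
```

Stopping on the first non-zero exit code keeps the failing run's hs_err file and core dump easy to match with the iteration that produced them.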