Details
Bug
Status: Resolved
Blocker
Resolution: Fixed
Impala 2.6.0
None
Description
The hang looks like this:
#0 0x000000398340e82d in read () from /lib64/libpthread.so.0
#1 0x00000039870dea71 in ?? () from /usr/lib64/libcrypto.so.10
#2 0x00000039870dcdc9 in BIO_read () from /usr/lib64/libcrypto.so.10
#3 0x0000003989431873 in ssl23_read_bytes () from /usr/lib64/libssl.so.10
#4 0x000000398942fe63 in ssl23_get_client_hello () from /usr/lib64/libssl.so.10
#5 0x00000039894302f3 in ssl23_accept () from /usr/lib64/libssl.so.10
#6 0x00000000015ee4bc in apache::thrift::transport::TSSLSocket::checkHandshake (this=0xf317b00) at src/thrift/transport/TSSLSocket.cpp:228
#7 0x00000000015ee820 in apache::thrift::transport::TSSLSocket::read (this=0xf317b00, buf=0x7f8a9ea750a0 "@S\247\236\212\177", len=5) at src/thrift/transport/TSSLSocket.cpp:164
#8 0x00000000015ebc4f in apache::thrift::transport::readAll<apache::thrift::transport::TSocket> (trans=..., buf=0x7f8a9ea750a0 "@S\247\236\212\177", len=5) at src/thrift/transport/TTransport.h:39
#9 0x0000000000a80228 in apache::thrift::transport::TTransport::readAll (len=5, buf=0x7f8a9ea750a0 "@S\247\236\212\177", this=<optimized out>) at /usr/src/debug/impala-2.3.0-cdh5.5.2/thirdparty/thrift-0.9.0/build/include/thrift/transport/TTransport.h:126
#10 apache::thrift::transport::TSaslTransport::receiveSaslMessage (this=0xb6a0770, status=0x7f8a9ea752e4, length=0x7f8a9ea752e8) at /usr/src/debug/impala-2.3.0-cdh5.5.2/be/src/transport/TSaslTransport.cpp:237
#11 0x0000000000a7dc84 in apache::thrift::transport::TSaslServerTransport::handleSaslStartMessage (this=0xb6a0770) at /usr/src/debug/impala-2.3.0-cdh5.5.2/be/src/transport/TSaslServerTransport.cpp:80
#12 0x0000000000a8075e in apache::thrift::transport::TSaslTransport::open (this=0xb6a0770) at /usr/src/debug/impala-2.3.0-cdh5.5.2/be/src/transport/TSaslTransport.cpp:95
#13 0x0000000000a7e9c1 in apache::thrift::transport::TSaslServerTransport::Factory::getTransport (this=0xd0edcb0, trans=...) at /usr/src/debug/impala-2.3.0-cdh5.5.2/be/src/transport/TSaslServerTransport.cpp:145
#14 0x00000000015f6f78 in apache::thrift::server::TThreadedServer::serve (this=0xc181420) at src/thrift/server/TThreadedServer.cpp:162
#15 0x000000000095149c in impala::ThriftServer::ThriftServerEventProcessor::Supervise (this=<optimized out>) at /usr/src/debug/impala-2.3.0-cdh5.5.2/be/src/rpc/thrift-server.cc:173
#16 0x0000000000ae0faa in boost::function0<void>::operator() (this=<optimized out>) at /opt/toolchain/boost-pic-1.55.0/include/boost/function/function_template.hpp:767
#17 impala::Thread::SuperviseThread(std::string const&, std::string const&, boost::function<void ()>, impala::Promise<long>*) (name=..., category=..., functor=..., thread_started=0x7fff9af4ca60) at /usr/src/debug/impala-2.3.0-cdh5.5.2/be/src/util/thread.cc:314
#18 0x0000000000ae3250 in boost::_bi::list4<boost::_bi::value<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::_bi::value<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::_bi::value<boost::function<void()> >, boost::_bi::value<impala::Promise<long int>*> >::operator()<void (*)(const std::string&, const std::string&, impala::Thread::ThreadFunctor, impala::Promise<long int>*), boost::_bi::list0> (a=..., f=@0xc3747b8: 0xae0df0 <impala::Thread::SuperviseThread(std::string const&, std::string const&, boost::function<void ()>, impala::Promise<long>*)>, this=0xc3747c0) at /opt/toolchain/boost-pic-1.55.0/include/boost/bind/bind.hpp:457
#19 boost::_bi::bind_t<void, void (*)(std::string const&, std::string const&, boost::function<void ()>, impala::Promise<long>*), boost::_bi::list4<boost::_bi::value<std::string>, boost::_bi::value<std::string>, boost::_bi::value<boost::function<void ()> >, boost::_bi::value<impala::Promise<long>*> > >::operator()() (this=0xc3747b8) at /opt/toolchain/boost-pic-1.55.0/include/boost/bind/bind_template.hpp:20
#20 boost::detail::thread_data<boost::_bi::bind_t<void, void (*)(std::string const&, std::string const&, boost::function<void ()>, impala::Promise<long>*), boost::_bi::list4<boost::_bi::value<std::string>, boost::_bi::value<std::string>, boost::_bi::value<boost::function<void ()> >, boost::_bi::value<impala::Promise<long>*> > > >::run() (this=0xc374600) at /opt/toolchain/boost-pic-1.55.0/include/boost/thread/detail/thread.hpp:117
#21 0x0000000000d28c43 in ?? ()
#22 0x0000003983407aa1 in start_thread () from /lib64/libpthread.so.0
#23 0x00000039830e893d in clone () from /lib64/libc.so.6
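This is very bad: the whole threaded server thread hangs, because it never gets a chance to dispatch the new serving thread via thread->start(). The accept loop in TThreadedServer::serve() is stuck inside getTransport(), blocked in read() while waiting for the client's SSL hello.
The impalad effectively becomes a zombie.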
Judging from http://github.mtv.cloudera.com/CDH/Impala/blob/cdh5-trunk/be/src/runtime/client-cache.cc#L106-L113
we should probably set the socket timeouts before calling OpenWithRetry().
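
To illustrate why a timeout set before the first blocking read matters, here is a minimal sketch using plain POSIX sockets. It is not the Impala or Thrift code: SetSocketTimeouts() and ReadFromSilentPeer() are hypothetical names made up for this example, and a socketpair() stands in for a peer that connects but never sends a byte. With SO_RCVTIMEO set, the read() gives up with EAGAIN/EWOULDBLOCK instead of blocking forever as in frame #0 above.

```cpp
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>
#include <cerrno>

// Hypothetical helper (not an Impala or Thrift API): bound how long a
// blocking read() or write() on `fd` may wait. Returns true on success.
bool SetSocketTimeouts(int fd, int timeout_ms) {
  timeval tv;
  tv.tv_sec = timeout_ms / 1000;
  tv.tv_usec = (timeout_ms % 1000) * 1000;
  return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) == 0 &&
         setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv)) == 0;
}

// Simulates a peer that is connected but silent. Returns the errno of the
// failed read(), or 0 if the read returned without an error.
int ReadFromSilentPeer(int timeout_ms) {
  int fds[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) return errno;

  int err = 0;
  if (!SetSocketTimeouts(fds[0], timeout_ms)) {
    err = errno;
  } else {
    char buf[5];  // same length as the 5-byte read in the backtrace
    ssize_t n = read(fds[0], buf, sizeof(buf));
    if (n < 0) err = errno;  // capture errno before close() can change it
  }
  close(fds[0]);
  close(fds[1]);
  return err;
}
```

Calling ReadFromSilentPeer(100) should return after roughly 100 ms with EAGAIN (or EWOULDBLOCK) rather than hanging. In Thrift itself, TSocket offers setRecvTimeout() and setSendTimeout() for the same purpose; where exactly they should be called in Impala is left open here.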