IMPALA-4038

RPC delays for single query can lead to ImpalaServer not making progress on any queries

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: Impala 2.7.0
    • Fix Version/s: Impala 2.8.0
    • Component/s: Distributed Exec
    • Labels:

      Description

      We observed a phenomenon where all Impala queries submitted to an Impala daemon got stuck in the CREATED state. One of the causes was an RPC not timing out (IMPALA-2799), but this was greatly exacerbated by the locking in ImpalaServer, which meant that no queries could make progress.

      We saw threads in the following callstacks.

      Either:
      a) Waiting for query_exec_state_map_lock_
      b) Holding query_exec_state_map_lock_ and waiting for QueryExecState::lock_
       38 __lll_lock_wait,_L_lock_854,pthread_mutex_lock,boost::mutex::lock(),impala::ImpalaServer::GetQueryExecState(impala::TUniqueId,impala::ImpalaServer::ReportExecStatus(impala::TReportExecStatusResult&,,impala::ImpalaInternalServiceProcessor::process_ReportExecStatus(int,,impala::ImpalaInternalServiceProcessor::dispatchCall(apache::thrift::protocol::TProtocol*,,apache::thrift::TDispatchProcessor::process(boost::shared_ptr<apache::thrift::protocol::TProtocol>,,apache::thrift::server::TThreadedServer::Task::run(),impala::ThriftThread::RunRunnable(boost::shared_ptr<apache::thrift::concurrency::Runnable>,,boost::detail::function::void_function_obj_invoker0<boost::_bi::bind_t<void,,impala::Thread::SuperviseThread(std::string,boost::detail::thread_data<boost::_bi::bind_t<void,,??,start_thread,clone
       30 __lll_lock_wait,_L_lock_854,pthread_mutex_lock,boost::mutex::lock(),impala::ImpalaServer::GetQueryExecState(impala::TUniqueId,impala::ImpalaServer::QuerySummaryCallback(bool,,impala::Webserver::RenderUrlWithTemplate(std::map<std::string,,impala::Webserver::BeginRequestCallback(sq_connection*,,??,??,??,start_thread,clone
      
      Waiting for Coordinator::lock_. Holds QueryExecState::lock_.
      1 __lll_lock_wait,_L_lock_854,pthread_mutex_lock,boost::mutex::lock(),impala::Coordinator::Cancel(impala::Status,impala::ImpalaServer::QueryExecState::Cancel(impala::Status,impala::ImpalaServer::CancelInternal(impala::TUniqueId,impala::ImpalaServer::UnregisterQuery(impala::TUniqueId,impala::ImpalaServer::CancelQueryUrlCallback(std::map<std::string,,impala::Webserver::RenderUrlWithTemplate(std::map<std::string,,impala::Webserver::BeginRequestCallback(sq_connection*,,??,??,??,start_thread,clone
      
      Waiting for the threads in ExecRemoteFragment(). Holds Coordinator::lock_.
      1 pthread_cond_wait@@GLIBC_2.3.2,boost::condition_variable::wait(boost::unique_lock<boost::mutex>&),impala::Promise<bool>::Get(),impala::Coordinator::StartRemoteFragments(impala::QuerySchedule*),impala::Coordinator::Exec(impala::QuerySchedule&,,impala::ImpalaServer::QueryExecState::ExecQueryOrDmlRequest(impala::TQueryExecRequest,impala::ImpalaServer::QueryExecState::ExecDdlRequest(),impala::ImpalaServer::QueryExecState::Exec(impala::TExecRequest*),impala::ImpalaServer::ExecuteInternal(impala::TQueryCtx,impala::ImpalaServer::Execute(impala::TQueryCtx*,,impala::ImpalaServer::query(beeswax::QueryHandle&,,beeswax::BeeswaxServiceProcessor::process_query(int,,beeswax::BeeswaxServiceProcessor::dispatchCall(apache::thrift::protocol::TProtocol*,,apache::thrift::TDispatchProcessor::process(boost::shared_ptr<apache::thrift::protocol::TProtocol>,,apache::thrift::server::TThreadPoolServer::Task::run(),apache::thrift::concurrency::ThreadManager::Worker::run(),impala::ThriftThread::RunRunnable(boost::shared_ptr<apache::thrift::concurrency::Runnable>,,boost::detail::function::void_function_obj_invoker0<boost::_bi::bind_t<void,,impala::Thread::SuperviseThread(std::string,boost::detail::thread_data<boost::_bi::bind_t<void,,??,start_thread,clone
      
      Waiting on the network. These threads hold FragmentExecStatus::lock_; the previous thread is waiting for them to complete.
      2 read,??,BIO_read,ssl23_read_bytes,ssl23_connect,apache::thrift::transport::TSSLSocket::checkHandshake(),apache::thrift::transport::TSSLSocket::write(unsigned,apache::thrift::transport::TBufferedTransport::flush(),apache::thrift::transport::TSaslTransport::sendSaslMessage(apache::thrift::transport::NegotiationStatus,,apache::thrift::transport::TSaslClientTransport::handleSaslStartMessage(),apache::thrift::transport::TSaslTransport::open(),impala::ThriftClientImpl::Open(),impala::ThriftClientImpl::OpenWithRetry(unsigned,impala::ClientCacheHelper::CreateClient(impala::TNetworkAddress,impala::ClientCacheHelper::GetClient(impala::TNetworkAddress,impala::ClientConnection<impala::ImpalaBackendClient>::ClientConnection(impala::ClientCache<impala::ImpalaBackendClient>*,,impala::Coordinator::ExecRemoteFragment(impala::FragmentExecParams,boost::detail::function::void_function_obj_invoker0<boost::_bi::bind_t<void,,impala::CallableThreadPool::Worker(int,,impala::ThreadPool<boost::function<void,impala::Thread::SuperviseThread(std::string,boost::detail::thread_data<boost::_bi::bind_t<void,,??,start_thread,clone
      

      The basic problem is a mismatch between the durations the two sets of locks are meant to be held for. ImpalaServer::query_exec_state_map_lock_ and QueryExecState::lock_ must be acquired at various points for any query to make progress, so it's important that they're only held for short critical sections. Coordinator::lock_ is held for long durations during RPCs in the coordinator, but holding it doesn't by itself prevent other queries from making progress, so that is acceptable.

      So it's bad if a thread tries to acquire Coordinator::lock_ while holding either of the other locks: it may wait a long time for Coordinator::lock_, and for that entire time it blocks progress for every other query on the server.
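
      As an illustration only (simplified stand-in classes that borrow the real member names, not Impala source), the chain looks roughly like this:

      // Minimal sketch of the lock-ordering problem described above.
      #include <mutex>

      struct Coordinator {
        std::mutex lock_;  // Held for long stretches, e.g. across fragment-exec RPCs.

        void Cancel() {
          std::lock_guard<std::mutex> l(lock_);  // Can block for the full RPC duration.
          // ... cancel remote fragments over the network ...
        }
      };

      struct QueryExecState {
        std::mutex lock_;  // Must only guard short critical sections.
        Coordinator* coordinator_ = nullptr;

        // Problematic pattern: Coordinator::lock_ is acquired while lock_ is held,
        // so one slow RPC inside Cancel() pins lock_ for its entire duration.
        void CancelWhileHoldingLock() {
          std::lock_guard<std::mutex> l(lock_);
          if (coordinator_ != nullptr) coordinator_->Cancel();
        }
      };

      // Meanwhile, ReportExecStatus() and the web UI callbacks take
      // ImpalaServer::query_exec_state_map_lock_ and then wait on the stuck
      // QueryExecState::lock_, so every other query on the daemon stalls behind
      // one slow RPC.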

        Issue Links

          Activity

          tarmstrong Tim Armstrong added a comment -

          I think we should try to fix this and IMPALA-4037 together since the underlying problem (lock ordering/acquisition) is the same.

          henryr Henry Robinson added a comment -

          IIRC the problem is the way you can hold both the map lock and wait for the exec state lock. There's no good reason for GetQueryExecState to allow callers to do this, and we've seen other situations where using the web pages to look at the query state for a query that's loading metadata can block all query submissions.

          I have a patch that removes the 'should lock the exec state' argument from GetQueryExecState. It's not active because I didn't want to focus on it for 2.7 but I will pick it up this week: https://gerrit.cloudera.org/#/c/3681/
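
          For illustration, a minimal sketch (assumed member names and shapes, not the actual patch) of a lookup that never waits on an exec state while the map lock is held:

          // Hedged sketch: the map lock only guards the lookup; callers lock the
          // returned exec state after the map lock has been released.
          #include <map>
          #include <memory>
          #include <mutex>

          struct QueryExecState { std::mutex lock_; /* ... */ };

          std::mutex query_exec_state_map_lock_;
          std::map<int, std::shared_ptr<QueryExecState>> query_exec_state_map_;  // int stands in for TUniqueId

          std::shared_ptr<QueryExecState> GetQueryExecState(int query_id) {
            std::lock_guard<std::mutex> l(query_exec_state_map_lock_);
            auto it = query_exec_state_map_.find(query_id);
            return it == query_exec_state_map_.end() ? nullptr : it->second;
          }

          With this shape, a query whose exec state lock is held for a long time (for example while loading metadata) can no longer block lookups, and therefore submissions, for every other query.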

          tarmstrong Tim Armstrong added a comment -

          I think Henry's patch will help here in that it breaks one part of the dependency chain. I also think that CancelInternal() shouldn't hold Coordinator::lock_ and QueryExecState::lock_ at the same time, so I'm going to put together a fix for that.
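
          A minimal sketch (illustrative names, not the eventual change) of dropping QueryExecState::lock_ before the slow cancel call:

          // Hedged sketch: the exec state lock only covers the flag flip and the
          // pointer read; the RPC-heavy Coordinator::Cancel() runs without it.
          #include <mutex>

          struct Coordinator { void Cancel() { /* RPCs to backends; assumed idempotent */ } };

          struct QueryExecState {
            std::mutex lock_;
            bool is_cancelled_ = false;
            Coordinator* coordinator_ = nullptr;

            void Cancel() {
              Coordinator* coord = nullptr;
              {
                std::lock_guard<std::mutex> l(lock_);  // short critical section only
                if (is_cancelled_) return;
                is_cancelled_ = true;
                coord = coordinator_;
              }
              // lock_ is released here, so threads in ReportExecStatus() or the web
              // UI are not blocked behind the network calls below.
              if (coord != nullptr) coord->Cancel();
            }
          };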

          sailesh Sailesh Mukil added a comment -

          Along with the timeouts, fixing IMPALA-2864 will also help here.

          tarmstrong Tim Armstrong added a comment -

          IMPALA-4037,IMPALA-4038: fix locking during query cancellation

          • Refactor the child query handling out of QueryExecState and clarify
            locking rules.
          • Avoid holding QueryExecState::lock_ while calling
            Coordinator::Cancel() or ChildQuery::Cancel(), which can both do RPCs
            or acquire ImpalaServer::query_exec_state_map_lock_.
          • Fix a potential race between QueryExecState::Exec() and
            QueryExecState::Cancel() where the cancelling thread did an unlocked
            read of the 'coordinator_' field and may not have cancelled the
            coordinator.

          Testing:
          Ran exhaustive build, ran local stress test for a bit.
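
          As a rough illustration of the third bullet above (assumed structure, not the actual change): both the publication and the read of 'coordinator_' happen under QueryExecState::lock_, and Exec() re-checks the cancellation flag after publishing, so a concurrent Cancel() is never lost.

          // Hedged sketch of closing the Exec()/Cancel() race; Coordinator::Cancel()
          // is assumed to be idempotent.
          #include <memory>
          #include <mutex>

          struct Coordinator { void Cancel() { /* cancel remote fragments */ } };

          struct QueryExecState {
            std::mutex lock_;
            bool is_cancelled_ = false;
            std::unique_ptr<Coordinator> coordinator_;

            void Exec() {
              auto coord = std::make_unique<Coordinator>();
              // ... start fragments ...
              bool cancelled;
              {
                std::lock_guard<std::mutex> l(lock_);
                coordinator_ = std::move(coord);  // publish under lock_
                cancelled = is_cancelled_;
              }
              if (cancelled) coordinator_->Cancel();  // Cancel() ran before publication
            }

            void Cancel() {
              Coordinator* coord;
              {
                std::lock_guard<std::mutex> l(lock_);
                is_cancelled_ = true;
                coord = coordinator_.get();  // read under lock_; may still be nullptr
              }
              if (coord != nullptr) coord->Cancel();
            }
          };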

          laszlog Laszlo Gaal added a comment -

          Code review for this bug is: https://gerrit.cloudera.org/#/c/4163/


            People

            • Assignee:
              tarmstrong Tim Armstrong
              Reporter:
              tarmstrong Tim Armstrong
            • Votes:
              0
              Watchers:
              8
