Description
When the stress test is configured to overcommit memory by 75% on a secure cluster, it invariably encounters a series of these errors:
Couldn't open transport for vd1442.test:22000 (SSL_connect: Resource temporarily unavailable)
The impalad log files record each error with the following stack trace:
I1031 12:44:34.102388 100580 status.cc:122] Couldn't open transport for vd1442.test:22000 (SSL_connect: Resource temporarily unavailable)
    @ 0x839b89 impala::Status::Status()
    @ 0xdd8ab5 impala::ThriftClientImpl::Open()
    @ 0xdd8dd0 impala::ThriftClientImpl::OpenWithRetry()
    @ 0xa10cda impala::ClientCacheHelper::CreateClient()
    @ 0xa112ab impala::ClientCacheHelper::GetClient()
    @ 0xdf3a33 impala::ClientConnection<>::ClientConnection()
    @ 0xdf8f7a impala::DataStreamSender::Channel::FlushAndSendEos()
    @ 0xdf967d impala::DataStreamSender::FlushFinal()
    @ 0xa4b4ed impala::FragmentInstanceState::ExecInternal()
    @ 0xa4e8a9 impala::FragmentInstanceState::Exec()
    @ 0xa2b028 impala::QueryState::ExecFInstance()
    @ 0xbce972 impala::Thread::SuperviseThread()
    @ 0xbcf0d4 boost::detail::thread_data<>::run()
    @ 0xe5b8aa thread_proxy
    @ 0x3ffcc07aa1 (unknown)
    @ 0x3ffc8e8bcd (unknown)
At the same time, host vd1442.test has a few of these log messages:
E1031 12:44:34.102897 75673 authentication.cc:159] SASL message (Kerberos (internal)): GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Clock skew too great)
I1031 12:44:34.113767 75673 thrift-util.cc:123] TAcceptQueueServer: Caught TException: SASL(-13): authentication failure: GSSAPI Failure: gss_accept_sec_context
I1031 12:44:34.145877 75673 thrift-util.cc:123] TAcceptQueueServer: Caught TException: SSL_accept: Broken pipe
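The client-side and server-side messages above line up within a few tens of milliseconds. To check whether that holds across a whole run, the two logs can be matched by timestamp. The following is a minimal sketch, not part of Impala or the stress test: the function names, the 100 ms window, and the placeholder year (glog lines carry no year) are all assumptions made for illustration.

```python
import re
from datetime import datetime, timedelta

# glog line prefix: Lmmdd hh:mm:ss.uuuuuu threadid file:line] message
GLOG_RE = re.compile(
    r"^(?P<level>[IWEF])(?P<month>\d{2})(?P<day>\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+(?P<tid>\d+) "
    r"(?P<src>[^\s:]+):(?P<line>\d+)\] (?P<msg>.*)$"
)


def parse_glog_line(line, year=2000):
    """Parse one glog-formatted line; return None if it does not match.

    glog omits the year, so `year` is a placeholder supplied by the caller.
    """
    m = GLOG_RE.match(line.strip())
    if m is None:
        return None  # e.g. a stack-frame line such as "@ 0x839b89 ..."
    stamp = f"{year}{m['month']}{m['day']} {m['time']}"
    return {
        "level": m["level"],
        "time": datetime.strptime(stamp, "%Y%m%d %H:%M:%S.%f"),
        "thread": int(m["tid"]),
        "source": f"{m['src']}:{m['line']}",
        "message": m["msg"],
    }


def correlate(client_lines, server_lines, window=timedelta(milliseconds=100)):
    """Pair each client "Couldn't open transport" entry with the server
    entries logged within `window` of it (hypothetical helper)."""
    server = [e for e in map(parse_glog_line, server_lines) if e is not None]
    pairs = []
    for entry in map(parse_glog_line, client_lines):
        if entry is None or "Couldn't open transport" not in entry["message"]:
            continue
        nearby = [s for s in server if abs(s["time"] - entry["time"]) <= window]
        pairs.append((entry, nearby))
    return pairs
```

Feeding it the impalad log of the sending host as `client_lines` and the log from vd1442.test as `server_lines` yields, for each failed connection attempt, the server messages logged around the same moment. This assumes the two hosts' clocks are close enough for the window to be meaningful, which the "Clock skew too great" message suggests may not be the case here.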