Description
After switching the tests from MANUAL_FLUSH mode to AUTO_FLUSH_BACKGROUND mode, a debug assert started to fire when running all_types-itest built in the debug configuration with KUDU_ALLOW_SLOW_TESTS=1. It is 100% reproducible; just run:
KUDU_ALLOW_SLOW_TESTS=1 ./bin/all_types-itest 2>/tmp/all_types.log
The test reports the following check failure and stack trace:
F0920 12:26:24.726744 13286 consensus_queue.cc:401] Check failed: request->ByteSize() <= FLAGS_consensus_max_batch_size_bytes (1168286 vs. 1048576)
*** Check failure stack trace: ***
    @     0x7f74c9f42294  google::LogMessage::SendToLog()
    @     0x7f74c9f42790  google::LogMessage::Flush()
    @     0x7f74c9f463c2  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f74cf109918  kudu::consensus::PeerMessageQueue::RequestForPeer()
    @     0x7f74cf0fc302  kudu::consensus::Peer::SendNextRequest()
    @     0x7f74cf0fdfaf  kudu::consensus::Peer::DoProcessResponse()
    @     0x7f74cf1050a9  kudu::internal::RunnableAdapter<>::Run()
    @     0x7f74cf10502c  kudu::internal::InvokeHelper<>::MakeItSo()
    @     0x7f74cf104ffa  kudu::internal::Invoker<>::Run()
    @     0x7f74cf1284ce  kudu::Callback<>::Run()
    @     0x7f74cf12e7b9  boost::_mfi::cmf0<>::operator()()
    @     0x7f74cf12e720  boost::_bi::list1<>::operator()<>()
    @     0x7f74cf12e6ca  boost::_bi::bind_t<>::operator()()
    @     0x7f74cf12e430  boost::detail::function::void_function_obj_invoker0<>::invoke()
    @     0x7f74ce164b98  boost::function0<>::operator()()
    @     0x7f74cab47589  kudu::FunctionRunnable::Run()
    @     0x7f74cab45994  kudu::ThreadPool::DispatchThread()
    @     0x7f74cab49cc9  boost::_mfi::mf1<>::operator()()
    @     0x7f74cab49c27  boost::_bi::list2<>::operator()<>()
    @     0x7f74cab49baa  boost::_bi::bind_t<>::operator()()
    @     0x7f74cab49930  boost::detail::function::void_function_obj_invoker0<>::invoke()
    @     0x7f74ce164b98  boost::function0<>::operator()()
    @     0x7f74cab3ab30  kudu::Thread::SuperviseThread()
    @       0x3ae0e079d1  (unknown)
    @       0x3ae0ae88fd  (unknown)
    @              (nil)  (unknown)
The patch that introduces the change from MANUAL_FLUSH to AUTO_FLUSH_BACKGROUND is at: https://gerrit.cloudera.org/#/c/4471/
To reproduce the issue, set consensus_max_batch_size_bytes to 1048576 (i.e. 1 MiB). As a temporary workaround to avoid triggering the debug assert, all_types-itest currently sets it to 2 MiB.
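For a quick sense of how far over the limit the failing request was, the numbers in the check-failure line can be compared directly. The snippet below is only an illustrative sketch: the helper name parse_check_failure is hypothetical and not part of Kudu, and it assumes the glog "(A vs. B)" suffix format seen in the log above.

```python
import re

# Check-failure line copied from the log above (stack trace omitted).
LOG_LINE = (
    "F0920 12:26:24.726744 13286 consensus_queue.cc:401] Check failed: "
    "request->ByteSize() <= FLAGS_consensus_max_batch_size_bytes "
    "(1168286 vs. 1048576)"
)

MIB = 1024 * 1024


def parse_check_failure(line):
    """Hypothetical helper: return (actual, limit) from a '(A vs. B)' suffix."""
    match = re.search(r"\((\d+) vs\. (\d+)\)", line)
    if match is None:
        raise ValueError("no '(A vs. B)' pair found in line")
    return int(match.group(1)), int(match.group(2))


actual, limit = parse_check_failure(LOG_LINE)
overage = actual - limit

print(f"request size: {actual} bytes")
print(f"limit:        {limit} bytes ({limit // MIB} MiB)")
print(f"over by:      {overage} bytes ({overage / limit:.1%})")
# The 2 MiB workaround covers this one observed request; it says nothing
# about whether larger requests can occur.
print(f"fits in 2 MiB workaround: {actual <= 2 * MIB}")
```

The request in this run exceeded the 1 MiB limit by 119710 bytes, which is why raising the flag to 2 MiB hides the assert for this test. That only accounts for the single request size captured in this log.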