Details
-
Bug
-
Status: Resolved
-
Blocker
-
Resolution: Fixed
-
Impala 2.12.0
-
ghx-label-7
Description
Exhaustive tests failed due to an impalad crash.
There is a core dump and a stacktrace:
#7 0x00000000018c3790 in impala::RuntimeProfile::EventSequence::Start (this=0x16a57bb0, start_time_ns=52490907124056) at /data/jenkins/workspace/impala-cdh5-trunk-exhaustive/repos/Impala/be/src/util/runtime-profile-counters.h:305 #8 0x00000000018bdbd4 in impala::FragmentInstanceState::Prepare (this=0x31d92000) at /data/jenkins/workspace/impala-cdh5-trunk-exhaustive/repos/Impala/be/src/runtime/fragment-instance-state.cc:135 #9 0x00000000018bd13f in impala::FragmentInstanceState::Exec (this=0x31d92000) at /data/jenkins/workspace/impala-cdh5-trunk-exhaustive/repos/Impala/be/src/runtime/fragment-instance-state.cc:73 #10 0x00000000018a5ec5 in impala::QueryState::ExecFInstance (this=0xb69d200, fis=0x31d92000) at /data/jenkins/workspace/impala-cdh5-trunk-exhaustive/repos/Impala/be/src/runtime/query-state.cc:388 #11 0x00000000018a475e in impala::QueryState::<lambda()>::operator()(void) const (__closure=0x7f7d00d54bc8) at /data/jenkins/workspace/impala-cdh5-trunk-exhaustive/repos/Impala/be/src/runtime/query-state.cc:328
It hit a DCHECK:
at java.nio.DirectByteBufferR.puF0127 03:26:11.638743 18765 runtime-profile-counters.h:305] Check failed: offset_ >= 0 (-1 vs. 0) *** Check failure stack trace: *** @ 0x3c2691d google::LogMessage::Fail() @ 0x3c281c2 google::LogMessage::SendToLog() @ 0x3c262f7 google::LogMessage::Flush() @ 0x3c298be google::LogMessageFatal::~LogMessageFatal() @ 0x18c3790 impala::RuntimeProfile::EventSequence::Start() @ 0x18bdbd4 impala::FragmentInstanceState::Prepare() @ 0x18bd13f impala::FragmentInstanceState::Exec() @ 0x18a5ec5 impala::QueryState::ExecFInstance() @ 0x18a475e _ZZN6impala10QueryState15StartFInstancesEvENKUlvE_clEv @ 0x18a6adf _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala10QueryState15StartFInstancesEvEUlvE_vE6invokeERNS1_15function_bufferE @ 0x17ef79e boost::function0<>::operator()() @ 0x1aecd8d impala::Thread::SuperviseThread() @ 0x1af5aa2 boost::_bi::list4<>::operator()<>() @ 0x1af59e5 boost::_bi::bind_t<>::operator()() @ 0x1af59a8 boost::detail::thread_data<>::run() @ 0x2db036a thread_proxy @ 0x3286407851 (unknown) @ 0x32860e894d (unknown)
Looks like the start time is somehow less than 'start_time_ns'
/// Starts the timer. All events will be recorded as if the timer had been started at /// 'start_time_ns', which must have been obtained by calling MonotonicStopWatch::Now(). void Start(int64_t start_time_ns) { offset_ = MonotonicStopWatch::Now() - start_time_ns; DCHECK_GE(offset_, 0); sw_.Start(); }
That code was added in this commit, but it doesn't really make sense why the clock would go backwards:
commit be0f4f9d21ac3c34f6ebf214e26686a008b57e24 Author: Lars Volker <lv@cloudera.com> Date: Tue Nov 28 15:08:29 2017 -0800 IMPALA-6190/6246: Add instances tab and event sequence
The time going backwards looks very similar to IMPALA-4631, which hasn't been resolved yet. Maybe we should loosen the DCHECK similarly...
Attachments
Issue Links
- is broken by
-
IMPALA-6190 Add a debug webpage to show fragment instances of a query
- Resolved
- relates to
-
IMPALA-4631 plan-fragment-executor.cc:518] Check failed: other_time <= total_time (25999394 vs. 25999393)
- Resolved