Description
Of late, we have seen several different types of crashes in our cluster. I don't see any similarity among the crash stack traces, and I am not able to reproduce the crashes. Below are the stack traces, which occurred in different daemons at different times. I have the complete stack traces; let me know if they would help with further debugging.
1)
10-1-33-172
Nov 6 18:04
Thread 1 (Thread 0x7f018b423700 (LWP 96325)):
#0 0x00007f0aaaf5e207 in raise () from /lib64/libc.so.6
No symbol table info available.
#1 0x00007f0aaaf5fa38 in abort () from /lib64/libc.so.6
No symbol table info available.
#2 0x00007f0aad280185 in os::abort(bool) () from /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#3 0x00007f0aad422593 in VMError::report_and_die() () from /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#4 0x00007f0aad28568f in JVM_handle_linux_signal () from /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#5 0x00007f0aad27bbe3 in signalHandler(int, siginfo*, void*) () from /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#6 <signal handler called>
No symbol table info available.
#7 0x00007f0a44ca3000 in ?? ()
No symbol table info available.
#8 0x0000000000dba5c1 in impala::HdfsParquetScanner::TransferScratchTuples(impala::RowBatch*) ()
No symbol table info available.
#9 0x0000000000dba924 in impala::HdfsParquetScanner::AssembleRows(std::vector<impala::ParquetColumnReader*, std::allocator<impala::ParquetColumnReader*> > const&, impala::RowBatch*, bool*) ()
No symbol table info available.
#10 0x0000000000dbf5f6 in impala::HdfsParquetScanner::GetNextInternal(impala::RowBatch*) ()
No symbol table info available.
#11 0x0000000000db9ba7 in impala::HdfsParquetScanner::ProcessSplit() ()
No symbol table info available.
#12 0x0000000000d835e6 in impala::HdfsScanNode::ProcessSplit(std::vector<impala::FilterContext, std::allocator<impala::FilterContext> > const&, impala::MemPool*, impala::io::ScanRange*) ()
No symbol table info available.
#13 0x0000000000d85115 in impala::HdfsScanNode::ScannerThread() ()
No symbol table info available.
#14 0x0000000000d16c83 in impala::Thread::SuperviseThread(std::string const&, std::string const&, boost::function<void ()>, impala::Promise<long>*) ()
No symbol table info available.
#15 0x0000000000d173c4 in boost::detail::thread_data<boost::_bi::bind_t<void, void (std::string const&, std::string const&, boost::function<void ()>, impala::Promise<long>), boost::_bi::list4<boost::_bi::value<std::string>, boost::_bi::value<std::string>, boost::_bi::value<boost::function<void ()> >, boost::_bi::value<impala::Promise<long>> > > >::run() ()
No symbol table info available.
#16 0x000000000128fada in thread_proxy ()
No symbol table info available.
#17 0x00007f0aab2fcdd5 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#18 0x00007f0aab026b3d in clone () from /lib64/libc.so.6
No symbol table info available.
Nov 1 11:21
Thread 1 (Thread 0x7fb3bffe9700 (LWP 50334)):
#0 0x00007fc2959f0207 in raise () from /lib64/libc.so.6
No symbol table info available.
#1 0x00007fc2959f18f8 in abort () from /lib64/libc.so.6
No symbol table info available.
#2 0x00007fc297d12185 in os::abort(bool) () from /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#3 0x00007fc297eb4593 in VMError::report_and_die() () from /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#4 0x00007fc297d1768f in JVM_handle_linux_signal () from /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#5 0x00007fc297d0dbe3 in signalHandler(int, siginfo*, void*) () from /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#6 <signal handler called>
No symbol table info available.
#7 0x00007fc1dd9f0000 in ?? ()
No symbol table info available.
#8 0x0000000000fdd9f9 in impala::PartitionedAggregationNode::Open(impala::RuntimeState*) ()
No symbol table info available.
#9 0x0000000000b74d6d in impala::FragmentInstanceState::Open() ()
No symbol table info available.
#10 0x0000000000b763ab in impala::FragmentInstanceState::Exec() ()
No symbol table info available.
#11 0x0000000000b65b38 in impala::QueryState::ExecFInstance(impala::FragmentInstanceState*) ()
No symbol table info available.
#12 0x0000000000d16c83 in impala::Thread::SuperviseThread(std::string const&, std::string const&, boost::function<void ()>, impala::Promise<long>*) ()
No symbol table info available.
#13 0x0000000000d173c4 in boost::detail::thread_data<boost::_bi::bind_t<void, void (std::string const&, std::string const&, boost::function<void ()>, impala::Promise<long>), boost::_bi::list4<boost::_bi::value<std::string>, boost::_bi::value<std::string>, boost::_bi::value<boost::function<void ()> >, boost::_bi::value<impala::Promise<long>> > > >::run() ()
No symbol table info available.
#14 0x000000000128fada in thread_proxy ()
No symbol table info available.
#15 0x00007fc295d8edd5 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#16 0x00007fc295ab8b3d in clone () from /lib64/libc.so.6
No symbol table info available.
2)
10-1-42-100
Oct 28 07:15
(This particular crash appears to be the one discussed in https://issues.apache.org/jira/browse/IMPALA-7194)
Thread 1 (Thread 0x7f6ef14f6700 (LWP 15999)):
#0 0x00007f7c95316207 in raise () from /lib64/libc.so.6
No symbol table info available.
#1 0x00007f7c953178f8 in abort () from /lib64/libc.so.6
No symbol table info available.
#2 0x00007f7c97638185 in os::abort(bool) () from /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#3 0x00007f7c977da593 in VMError::report_and_die() () from /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#4 0x00007f7c9763d68f in JVM_handle_linux_signal () from /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#5 0x00007f7c97633be3 in signalHandler(int, siginfo*, void*) () from /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#6 <signal handler called>
No symbol table info available.
#7 0x00007f7bdcdaa830 in ?? ()
No symbol table info available.
#8 0x0000000000fddd0f in impala::PartitionedAggregationNode::Open(impala::RuntimeState*) ()
No symbol table info available.
#9 0x0000000000b74d6d in impala::FragmentInstanceState::Open() ()
No symbol table info available.
#10 0x0000000000b763ab in impala::FragmentInstanceState::Exec() ()
No symbol table info available.
#11 0x0000000000b65b38 in impala::QueryState::ExecFInstance(impala::FragmentInstanceState*) ()
No symbol table info available.
#12 0x0000000000d16c83 in impala::Thread::SuperviseThread(std::string const&, std::string const&, boost::function<void ()>, impala::Promise<long>*) ()
No symbol table info available.
#13 0x0000000000d173c4 in boost::detail::thread_data<boost::_bi::bind_t<void, void (std::string const&, std::string const&, boost::function<void ()>, impala::Promise<long>), boost::_bi::list4<boost::_bi::value<std::string>, boost::_bi::value<std::string>, boost::_bi::value<boost::function<void ()> >, boost::_bi::value<impala::Promise<long>> > > >::run() ()
No symbol table info available.
#14 0x000000000128fada in thread_proxy ()
No symbol table info available.
#15 0x00007f7c956b4dd5 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#16 0x00007f7c953deb3d in clone () from /lib64/libc.so.6
No symbol table info available.
3)
10-1-43-65
Nov 5 19:25
Thread 1 (Thread 0x7f484dfc3700 (LWP 5561)):
#0 0x00007f4c61734207 in raise () from /lib64/libc.so.6
No symbol table info available.
#1 0x00007f4c617358f8 in abort () from /lib64/libc.so.6
No symbol table info available.
#2 0x00007f4c63a56185 in os::abort(bool) () from /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#3 0x00007f4c63bf8593 in VMError::report_and_die() () from /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#4 0x00007f4c63a5b68f in JVM_handle_linux_signal () from /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#5 0x00007f4c63a51be3 in signalHandler(int, siginfo*, void*) () from /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#6 <signal handler called>
No symbol table info available.
#7 0x00007f4ba9146250 in ?? ()
No symbol table info available.
#8 0x0000000001023b52 in impala::PhjBuilder::Partition::BuildHashTable(bool*) ()
No symbol table info available.
#9 0x0000000001023dde in impala::PhjBuilder::BuildHashTablesAndPrepareProbeStreams() ()
No symbol table info available.
#10 0x0000000001024340 in impala::PhjBuilder::FlushFinal(impala::RuntimeState*) ()
No symbol table info available.
#11 0x000000000100da6d in impala::Status impala::BlockingJoinNode::SendBuildInputToSink<true>(impala::RuntimeState*, impala::DataSink*) ()
No symbol table info available.
#12 0x000000000100c2a0 in impala::BlockingJoinNode::ProcessBuildInputAsync(impala::RuntimeState*, impala::DataSink*, impala::Status*) ()
No symbol table info available.
#13 0x0000000000d16c83 in impala::Thread::SuperviseThread(std::string const&, std::string const&, boost::function<void ()>, impala::Promise<long>*) ()
No symbol table info available.
#14 0x0000000000d173c4 in boost::detail::thread_data<boost::_bi::bind_t<void, void (std::string const&, std::string const&, boost::function<void ()>, impala::Promise<long>), boost::_bi::list4<boost::_bi::value<std::string>, boost::_bi::value<std::string>, boost::_bi::value<boost::function<void ()> >, boost::_bi::value<impala::Promise<long>> > > >::run() ()
No symbol table info available.
#15 0x000000000128fada in thread_proxy ()
No symbol table info available.
#16 0x00007f4c61ad2dd5 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#17 0x00007f4c617fcb3d in clone () from /lib64/libc.so.6
No symbol table info available.
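In case it helps with comparing the traces above, here is a minimal sketch of a script that pulls out, for each backtrace, the frame immediately after `<signal handler called>` and the first caller frame that has a symbol name. The helper names (`parse_frames`, `crash_site`) and the `SAMPLE` excerpt are only illustrative; it assumes the traces are plain gdb backtrace text in the format shown in this report, and it is not part of Impala or gdb.

```python
import re

# Matches gdb frame lines such as
#   "#8 0x0000000000dba5c1 in impala::Foo::Bar(int) ()"
#   "#6 <signal handler called>"
FRAME_RE = re.compile(r"^#(\d+)\s+(?:(0x[0-9a-fA-F]+) in )?(.*)$")


def parse_frames(trace_text):
    """Return a list of (frame_number, address_or_None, description)."""
    frames = []
    for line in trace_text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            frames.append((int(m.group(1)), m.group(2), m.group(3).strip()))
    return frames


def crash_site(trace_text):
    """Return (faulting_frame, first_named_caller) for one backtrace.

    faulting_frame is the frame right after "<signal handler called>";
    first_named_caller is the first later frame whose symbol is not "??".
    Both are None if the trace has no signal handler frame.
    """
    frames = parse_frames(trace_text)
    for i, (_, _, desc) in enumerate(frames):
        if desc.startswith("<signal handler called>"):
            after = frames[i + 1:]
            break
    else:
        return None, None
    faulting = after[0] if after else None
    caller = next((f for f in after if not f[2].startswith("??")), None)
    return faulting, caller


# Short excerpt taken from the first trace in this report.
SAMPLE = """\
#5 0x00007f0aad27bbe3 in signalHandler(int, siginfo*, void*) () from /usr/java/jdk1.8.0_144/jre/lib/amd64/server/libjvm.so
#6 <signal handler called>
#7 0x00007f0a44ca3000 in ?? ()
#8 0x0000000000dba5c1 in impala::HdfsParquetScanner::TransferScratchTuples(impala::RowBatch*) ()
"""

if __name__ == "__main__":
    faulting, caller = crash_site(SAMPLE)
    print("faulting frame:", faulting)
    print("first named caller:", caller)
```

Running `crash_site` over each of the four traces would give a compact list of faulting frames and their named callers, which may be easier to compare side by side than the full backtraces.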