IMPALA-4494

Crash in SimpleScheduler when restarting under load

    Details

      Description

      During startup, the scheduler can begin scheduling queries before the local node has been registered as a backend through the statestore. If a query runs with exec_at_coord set, the scheduler fails to look up the local backend in the BackendConfig and eventually crashes.

      #0  0x00007f387dd5e5e5 in ?? () from sysroot/lib64/libc.so.6
      #1  0x00007f387dd5fdc5 in abort () from sysroot/lib64/libc.so.6
      #2  0x00007f387fcd5a55 in os::abort(bool) () from sysroot/usr/java/jdk1.7.0_67/jre/lib/amd64/server/libjvm.so
      #3  0x00007f387fe55f87 in VMError::report_and_die() () from sysroot/usr/java/jdk1.7.0_67/jre/lib/amd64/server/libjvm.so
      #4  0x00007f387fe5650e in crash_handler(int, siginfo*, void*) () from sysroot/usr/java/jdk1.7.0_67/jre/lib/amd64/server/libjvm.so
      #5  0x00007f387fcd4bf2 in os::Linux::chained_handler(int, siginfo*, void*) () from sysroot/usr/java/jdk1.7.0_67/jre/lib/amd64/server/libjvm.so
      #6  0x00007f387fcda8d6 in JVM_handle_linux_signal () from sysroot/usr/java/jdk1.7.0_67/jre/lib/amd64/server/libjvm.so
      #7  <signal handler called>
      #8  0x00007f387fccc511 in os::is_first_C_frame(frame*) () from sysroot/usr/java/jdk1.7.0_67/jre/lib/amd64/server/libjvm.so
      #9  0x00007f387fe5467d in VMError::report(outputStream*) () from sysroot/usr/java/jdk1.7.0_67/jre/lib/amd64/server/libjvm.so
      #10 0x00007f387fe55b8a in VMError::report_and_die() () from sysroot/usr/java/jdk1.7.0_67/jre/lib/amd64/server/libjvm.so
      #11 0x00007f387fcda96f in JVM_handle_linux_signal () from sysroot/usr/java/jdk1.7.0_67/jre/lib/amd64/server/libjvm.so
      #12 <signal handler called>
      #13 0x0000000000a93881 in impala::SimpleScheduler::AssignmentCtx::GetBackendRank(std::string const&) const ()
      #14 0x0000000000a989e3 in impala::SimpleScheduler::AssignmentCtx::RecordScanRangeAssignment(impala::TBackendDescriptor const&, int, std::vector<impala::TNetworkAddress, std::allocator<impala::TNetworkAddress> > const&, impala::TScanRangeLocations const&, boost::unordered::unordered_map<impala::TNetworkAddress, std::map<int, std::vector<impala::TScanRangeParams, std::allocator<impala::TScanRangeParams> >, std::less<int>, std::allocator<std::pair<int const, std::vector<impala::TScanRangeParams, std::allocator<impala::TScanRangeParams> > > > >, boost::hash<impala::TNetworkAddress>, std::equal_to<impala::TNetworkAddress>, std::allocator<std::pair<impala::TNetworkAddress const, std::map<int, std::vector<impala::TScanRangeParams, std::allocator<impala::TScanRangeParams> >, std::less<int>, std::allocator<std::pair<int const, std::vector<impala::TScanRangeParams, std::allocator<impala::TScanRangeParams> > > > > > > >*) ()
      #15 0x0000000000a9979a in impala::SimpleScheduler::ComputeScanRangeAssignment(impala::BackendConfig const&, int, impala::TReplicaPreference::type const*, bool, std::vector<impala::TScanRangeLocations, std::allocator<impala::TScanRangeLocations> > const&, std::vector<impala::TNetworkAddress, std::allocator<impala::TNetworkAddress> > const&, bool, impala::TQueryOptions const&, impala::RuntimeProfile::Counter*, boost::unordered::unordered_map<impala::TNetworkAddress, std::map<int, std::vector<impala::TScanRangeParams, std::allocator<impala::TScanRangeParams> >, std::less<int>, std::allocator<std::pair<int const, std::vector<impala::TScanRangeParams, std::allocator<impala::TScanRangeParams> > > > >, boost::hash<impala::TNetworkAddress>, std::equal_to<impala::TNetworkAddress>, std::allocator<std::pair<impala::TNetworkAddress const, std::map<int, std::vector<impala::TScanRangeParams, std::allocator<impala::TScanRangeParams> >, std::less<int>, std::allocator<std::pair<int const, std::vector<impala::TScanRangeParams, std::allocator<impala::TScanRangeParams> > > > > > > >*) ()
      #16 0x0000000000a99b44 in impala::SimpleScheduler::ComputeScanRangeAssignment(impala::TQueryExecRequest const&, impala::QuerySchedule*) ()
      #17 0x0000000000a99ddf in impala::SimpleScheduler::Schedule(impala::Coordinator*, impala::QuerySchedule*) ()
      #18 0x0000000000b26178 in impala::ImpalaServer::QueryExecState::ExecQueryOrDmlRequest(impala::TQueryExecRequest const&) ()
      #19 0x0000000000b28d84 in impala::ImpalaServer::QueryExecState::Exec(impala::TExecRequest*) ()
      #20 0x0000000000ad4336 in impala::ImpalaServer::ExecuteInternal(impala::TQueryCtx const&, std::shared_ptr<impala::ImpalaServer::SessionState>, bool*, std::shared_ptr<impala::ImpalaServer::QueryExecState>*) ()
      #21 0x0000000000ad9ba8 in impala::ImpalaServer::Execute(impala::TQueryCtx*, std::shared_ptr<impala::ImpalaServer::SessionState>, std::shared_ptr<impala::ImpalaServer::QueryExecState>*) ()
      #22 0x0000000000b0834b in impala::ImpalaServer::ExecuteStatement(apache::hive::service::cli::thrift::TExecuteStatementResp&, apache::hive::service::cli::thrift::TExecuteStatementReq const&) ()
      #23 0x0000000000b453e0 in impala::ChildQuery::ExecAndFetch() ()
      #24 0x0000000000b1e973 in impala::ImpalaServer::QueryExecState::ExecChildQueries() ()
      #25 0x0000000000bf5ad9 in impala::Thread::SuperviseThread(std::string const&, std::string const&, boost::function<void ()>, impala::Promise<long>*) ()
      #26 0x0000000000bf6474 in boost::detail::thread_data<boost::_bi::bind_t<void, void (*)(std::string const&, std::string const&, boost::function<void ()>, impala::Promise<long>*), boost::_bi::list4<boost::_bi::value<std::string>, boost::_bi::value<std::string>, boost::_bi::value<boost::function<void ()> >, boost::_bi::value<impala::Promise<long>*> > > >::run() ()
      #27 0x0000000000e5c3aa in ?? ()
      #28 0x00007f387e0c7aa1 in start_thread () from sysroot/lib64/libpthread.so.0
      #29 0x00007f387de14aad in ?? () from sysroot/lib64/libc.so.6
      #30 0x0000000000000000 in ?? ()
      

          Activity

          kwho Michael Ho added a comment -

          Hi Lars Volker, Impalad is still crashing the same way after commit 96d98abff52e59742469f9d5a86018506e00f88f

          (gdb) bt
          #0  0x00000036ed232625 in raise () from /lib64/libc.so.6
          #1  0x00000036ed233e05 in abort () from /lib64/libc.so.6
          #2  0x00007f055bfc4a55 in os::abort(bool) () from /usr/java/jdk1.7.0_67-cloudera/jre/lib/amd64/server/libjvm.so
          #3  0x00007f055c144f87 in VMError::report_and_die() () from /usr/java/jdk1.7.0_67-cloudera/jre/lib/amd64/server/libjvm.so
          #4  0x00007f055c14550e in crash_handler(int, siginfo*, void*) () from /usr/java/jdk1.7.0_67-cloudera/jre/lib/amd64/server/libjvm.so
          #5  0x00007f055bfc3bf2 in os::Linux::chained_handler(int, siginfo*, void*) () from /usr/java/jdk1.7.0_67-cloudera/jre/lib/amd64/server/libjvm.so
          #6  0x00007f055bfc98d6 in JVM_handle_linux_signal () from /usr/java/jdk1.7.0_67-cloudera/jre/lib/amd64/server/libjvm.so
          #7  <signal handler called>
          #8  0x00007f055bfbb511 in os::is_first_C_frame(frame*) () from /usr/java/jdk1.7.0_67-cloudera/jre/lib/amd64/server/libjvm.so
          #9  0x00007f055c14367d in VMError::report(outputStream*) () from /usr/java/jdk1.7.0_67-cloudera/jre/lib/amd64/server/libjvm.so
          #10 0x00007f055c144b8a in VMError::report_and_die() () from /usr/java/jdk1.7.0_67-cloudera/jre/lib/amd64/server/libjvm.so
          #11 0x00007f055bfc996f in JVM_handle_linux_signal () from /usr/java/jdk1.7.0_67-cloudera/jre/lib/amd64/server/libjvm.so
          #12 <signal handler called>
          #13 impala::SimpleScheduler::AssignmentCtx::GetBackendRank (this=Unhandled dwarf expression opcode 0xf3
          ) at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/scheduling/simple-scheduler.cc:881
          #14 0x0000000000a721fc in impala::SimpleScheduler::AssignmentCtx::RecordScanRangeAssignment (this=0x7f04e74c3790, backend=..., node_id=0, host_list=..., scan_range_locations=..., assignment=0x8b027c8)
              at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/scheduling/simple-scheduler.cc:925
          #15 0x0000000000a75986 in impala::SimpleScheduler::ComputeScanRangeAssignment (this=0x31f7260, backend_config=Unhandled dwarf expression opcode 0xf3
          ) at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/scheduling/simple-scheduler.cc:684
          #16 0x0000000000a75de9 in impala::SimpleScheduler::ComputeScanRangeAssignment (this=0x31f7260, schedule=Unhandled dwarf expression opcode 0xf3
          ) at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/scheduling/simple-scheduler.cc:332
          #17 0x0000000000a76048 in impala::SimpleScheduler::Schedule (this=0x31f7260, schedule=0x9dd0680) at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/scheduling/simple-scheduler.cc:775
          #18 0x0000000000af4ac7 in impala::ImpalaServer::QueryExecState::ExecQueryOrDmlRequest (this=0x88f8800, query_exec_request=...)
              at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/service/query-exec-state.cc:438
          #19 0x0000000000afc2b4 in impala::ImpalaServer::QueryExecState::Exec (this=0x88f8800, exec_request=0x7f04e74c4c10) at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/service/query-exec-state.cc:154
          #20 0x0000000000aaebae in impala::ImpalaServer::ExecuteInternal (this=0x7ebd000, query_ctx=..., session_state=Unhandled dwarf expression opcode 0xf3
          ) at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/service/impala-server.cc:814
          #21 0x0000000000ab4658 in impala::ImpalaServer::Execute (this=0x7ebd000, query_ctx=0x7f04e74c6260, session_state=..., exec_state=0x7f04e74c6220)
              at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/service/impala-server.cc:761
          #22 0x0000000000aeeb97 in impala::ImpalaServer::query (this=0x7ebd000, query_handle=..., query=Unhandled dwarf expression opcode 0xf3
          ) at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/service/impala-beeswax-server.cc:66
          #23 0x0000000000d35045 in beeswax::BeeswaxServiceProcessor::process_query (this=0x9c3a120, seqid=0, iprot=Unhandled dwarf expression opcode 0xf3
          ) at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/generated-sources/gen-cpp/BeeswaxService.cpp:2979
          #24 0x0000000000d38344 in beeswax::BeeswaxServiceProcessor::dispatchCall (this=Unhandled dwarf expression opcode 0xf3
          ) at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/generated-sources/gen-cpp/BeeswaxService.cpp:2952
          #25 0x000000000080376c in apache::thrift::TDispatchProcessor::process (this=0x9c3a120, in=..., out=..., connectionContext=0x9022180)
              at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/toolchain/thrift-0.9.0-p8/include/thrift/TDispatchProcessor.h:121
          #26 0x0000000001b0d6fb in apache::thrift::server::TThreadPoolServer::Task::run() ()
          #27 0x0000000001af52b9 in apache::thrift::concurrency::ThreadManager::Worker::run() ()
          #28 0x00000000009f1189 in impala::ThriftThread::RunRunnable (this=Unhandled dwarf expression opcode 0xf3
          ) at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/rpc/thrift-thread.cc:64
          #29 0x00000000009f1be2 in operator() (function_obj_ptr=Unhandled dwarf expression opcode 0xf3
          ) at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/toolchain/boost-1.57.0/include/boost/bind/mem_fn_template.hpp:280
          #30 operator()<boost::_mfi::mf2<void, impala::ThriftThread, boost::shared_ptr<apache::thrift::concurrency::Runnable>, impala::Promise<long unsigned int>*>, boost::_bi::list0> (function_obj_ptr=Unhandled dwarf expression opcode 0xf3
          )
              at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/toolchain/boost-1.57.0/include/boost/bind/bind.hpp:392
          #31 operator() (function_obj_ptr=Unhandled dwarf expression opcode 0xf3
          ) at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/toolchain/boost-1.57.0/include/boost/bind/bind_template.hpp:20
          #32 boost::detail::function::void_function_obj_invoker0<boost::_bi::bind_t<void, boost::_mfi::mf2<void, impala::ThriftThread, boost::shared_ptr<apache::thrift::concurrency::Runnable>, impala::Promise<unsigned long>*>, boost::_bi::list3<boost::_bi::value<impala::ThriftThread*>, boost::_bi::value<boost::shared_ptr<apache::thrift::concurrency::Runnable> >, boost::_bi::value<impala::Promise<unsigned long>*> > >, void>::invoke (function_obj_ptr=Unhandled dwarf expression opcode 0xf3
          )
              at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/toolchain/boost-1.57.0/include/boost/function/function_template.hpp:153
          #33 0x0000000000bd22a9 in operator() (name=Unhandled dwarf expression opcode 0xf3
          ) at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/toolchain/boost-1.57.0/include/boost/function/function_template.hpp:767
          #34 impala::Thread::SuperviseThread (name=Unhandled dwarf expression opcode 0xf3
          ) at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/be/src/util/thread.cc:317
          #35 0x0000000000bd2c84 in operator()<void (*)(const std::basic_string<char>&, const std::basic_string<char>&, boost::function<void()>, impala::Promise<long int>*), boost::_bi::list0> (this=0x8e74800)
              at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/toolchain/boost-1.57.0/include/boost/bind/bind.hpp:457
          #36 operator() (this=0x8e74800) at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/toolchain/boost-1.57.0/include/boost/bind/bind_template.hpp:20
          #37 boost::detail::thread_data<boost::_bi::bind_t<void, void (*)(const std::basic_string<char, std::char_traits<char>, std::allocator<char> >&, const std::basic_string<char, std::char_traits<char>, std::allocator<char> >&, boost::function<void()>, impala::Promise<long int>*), boost::_bi::list4<boost::_bi::value<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::_bi::value<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::_bi::value<boost::function<void()> >, boost::_bi::value<impala::Promise<long int>*> > > >::run(void) (this=0x8e74800)
              at /data/jenkins/workspace/impala-private-build-binaries/repos/Impala/toolchain/boost-1.57.0/include/boost/thread/detail/thread.hpp:116
          #38 0x0000000000e17c5a in thread_proxy ()
          #39 0x00000036ed6079d1 in start_thread () from /lib64/libpthread.so.0
          #40 0x00000036ed2e88fd in clone () from /lib64/libc.so.6
          

          Please find the core (core.24260) at vb0204.halxg.cloudera.com.

          lv Lars Volker added a comment -

          Michael Ho - Thanks for reporting this. I opened IMPALA-4540 to track this new issue.

          lv Lars Volker added a comment -

          IMPALA-4494: Fix crash in SimpleScheduler

          The scheduler maintains a local list of active backends, which is
          updated through messages from the statestore. Even the local backend
          enters this list by registering with the statestore and being included
          in a statestore update message. Thus, during restarts it can happen that
          a query gets scheduled with exec_at_coord set to true, while the local
          backend has not been registered with the scheduler. In this case the IP
          address lookup in the internal BackendConfig fails and an empty IP
          address is returned, leading to a nullptr dereference down the line.

          This change adds an additional check when handling updates from the
          statestore to make sure that the backend config always contains the
          local backend. It also changes scheduling when exec_at_coord is true to
          always use the local backend, irrespective of whether it is present in
          the backend config.

          Change-Id: I6e1196a2fa47e5954c4a190aa326c135d039a77f
          Reviewed-on: http://gerrit.cloudera.org:8080/5127
          Reviewed-by: Bharath Vissapragada <bharathv@cloudera.com>
          Tested-by: Internal Jenkins
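
The two changes described in the commit message above can be illustrated with a minimal sketch. This is not Impala's actual code: `Backend`, `BackendConfig`, `Scheduler`, `HandleStatestoreUpdate`, and `PickBackend` are simplified, hypothetical stand-ins, and the statestore update is modeled as a full replacement of the backend list purely for brevity. The sketch only shows the idea of (1) re-adding the local backend after every update and (2) bypassing the lookup entirely when exec_at_coord is true.

```cpp
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical, simplified backend descriptor (not Impala's TBackendDescriptor).
struct Backend {
  std::string hostname;
  std::string ip;
};

// Hypothetical stand-in for the scheduler's view of active backends.
class BackendConfig {
 public:
  void Add(const Backend& be) { by_host_[be.hostname] = be; }
  void Clear() { by_host_.clear(); }
  // Returns nullptr instead of an empty address when the host is unknown,
  // so callers are forced to handle the "not registered yet" case.
  const Backend* Lookup(const std::string& hostname) const {
    auto it = by_host_.find(hostname);
    return it == by_host_.end() ? nullptr : &it->second;
  }

 private:
  std::unordered_map<std::string, Backend> by_host_;
};

class Scheduler {
 public:
  explicit Scheduler(Backend local) : local_(std::move(local)) {}

  // Assumption: each update carries the full backend list. After applying
  // it, make sure the local backend is present even if the statestore has
  // not reported it yet.
  void HandleStatestoreUpdate(const std::vector<Backend>& backends) {
    config_.Clear();
    for (const Backend& be : backends) config_.Add(be);
    if (config_.Lookup(local_.hostname) == nullptr) config_.Add(local_);
  }

  // With exec_at_coord, always use the local backend, whether or not it is
  // in the config. Otherwise look the host up; nullptr means "unknown".
  const Backend* PickBackend(bool exec_at_coord,
                             const std::string& preferred_host) const {
    if (exec_at_coord) return &local_;
    return config_.Lookup(preferred_host);
  }

 private:
  Backend local_;
  BackendConfig config_;
};
```

In this sketch a query scheduled with exec_at_coord before any statestore update still resolves to the local backend, while a lookup for an unregistered remote host yields nullptr that the caller must check rather than dereference.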

          lv Lars Volker added a comment -

          The fix for this issue had a flaw, which is fixed in IMPALA-4540.


            People

            • Assignee:
              lv Lars Volker
              Reporter:
              lv Lars Volker
