IMPALA-166

SimpleScheduler::HasLocalHost should hold host_map_lock_; the missing lock causes intermittent crashes under concurrency.


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 0.7
    • Fix Version/s: Impala 0.7
    • Component/s: None
    • Labels: None

    Description

      The coordinator died while running the stress job on the Jenkins cluster with the concurrency level set to 20.

      <stack>
      ...
      ...
      #13 0x0000000000c5a727 in boost::unordered_map<std::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::list<impala::TNetworkAddress, std::allocator<impala::TNetworkAddress> >, boost::hash<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::list<impala::TNetworkAddress, std::allocator<impala::TNetworkAddress> > > > >::find (this=0x3305ce8, k="10.20.90.18") at /usr/include/boost/unordered/unordered_map.hpp:434
      #14 0x0000000000c5a3df in impala::SimpleScheduler::HasLocalHost (this=0x3305ce0, data_location=...) at /usr/src/debug/impala-0.7-SNAPSHOT/be/src/statestore/simple-scheduler.h:69
      #15 0x0000000000be62a2 in impala::Coordinator::ComputeScanRangeAssignment (this=0xf350000, node_id=0, locations=std::vector of length 590, capacity 590 = {...}, exec_at_coord=false, params=..., assignment=0x95446c8)
      at /usr/src/debug/impala-0.7-SNAPSHOT/be/src/runtime/coordinator.cc:1348
      #16 0x0000000000be5d81 in impala::Coordinator::ComputeScanRangeAssignment (this=0xf350000, exec_request=...) at /usr/src/debug/impala-0.7-SNAPSHOT/be/src/runtime/coordinator.cc:1314
      #17 0x0000000000bdb0d6 in impala::Coordinator::Exec (this=0xf350000, query_id=..., request=0x5e8f7a0, query_options=...) at /usr/src/debug/impala-0.7-SNAPSHOT/be/src/runtime/coordinator.cc:293
      #18 0x0000000000a5c4d0 in impala::ImpalaServer::QueryExecState::Exec (this=0x5e8f200, exec_request=0x7fd6428ed220) at /usr/src/debug/impala-0.7-SNAPSHOT/be/src/service/impala-server.cc:159
      #19 0x0000000000a6b2a4 in impala::ImpalaServer::ExecuteInternal (this=0x5df3400, request=..., session_key="10.20.90.10:45033", registered_exec_state=0x7fd6428ed6c7, exec_state=0x7fd6428ed7b0)
      at /usr/src/debug/impala-0.7-SNAPSHOT/be/src/service/impala-server.cc:1011
      #20 0x0000000000a6ad28 in impala::ImpalaServer::Execute (this=0x5df3400, request=..., session_key="10.20.90.10:45033", exec_state=0x7fd6428ed7b0) at /usr/src/debug/impala-0.7-SNAPSHOT/be/src/service/impala-server.cc:967
      #21 0x0000000000aed960 in impala::ImpalaServer::query (this=0x5df3400, query_handle=..., query=...) at /usr/src/debug/impala-0.7-SNAPSHOT/be/src/service/impala-beeswax-server.cc:148
      #22 0x0000000000cafc87 in beeswax::BeeswaxServiceProcessor::process_query (this=0x335b540, seqid=0, iprot=0xe9587e40, oprot=0xa885880, callContext=0x32f2a08)
      at /usr/src/debug/impala-0.7-SNAPSHOT/be/generated-sources/gen-cpp/BeeswaxService.cpp:2979
      #23 0x0000000000cafa2c in beeswax::BeeswaxServiceProcessor::dispatchCall (this=0x335b540, iprot=0xe9587e40, oprot=0xa885880, fname="query", seqid=0, callContext=0x32f2a08)
      at /usr/src/debug/impala-0.7-SNAPSHOT/be/generated-sources/gen-cpp/BeeswaxService.cpp:2952
      #24 0x0000000000c9c195 in impala::ImpalaServiceProcessor::dispatchCall (this=0x335b540, iprot=0xe9587e40, oprot=0xa885880, fname="query", seqid=0, callContext=0x32f2a08)
      at /usr/src/debug/impala-0.7-SNAPSHOT/be/generated-sources/gen-cpp/ImpalaService.cpp:1143
      #25 0x0000000000a77ef2 in apache::thrift::TDispatchProcessor::process (this=0x335b540, in=..., out=..., connectionContext=0x32f2a08)
      at /usr/src/debug/impala-0.7-SNAPSHOT/thirdparty/thrift-0.9.0/build/include/thrift/TDispatchProcessor.h:121
      #26 0x00000000016e1689 in apache::thrift::server::TThreadPoolServer::Task::run() ()
      #27 0x00000000016d28bf in apache::thrift::concurrency::ThreadManager::Task::run() ()
      #28 0x00000000016d4804 in apache::thrift::concurrency::ThreadManager::Worker::run() ()
      #29 0x00000000016e7532 in apache::thrift::concurrency::PthreadThread::threadMain (arg=0x5179e60) at src/thrift/concurrency/PosixThreadFactory.cpp:208
      #30 0x00000035dfe07851 in start_thread (arg=0x7fd6428ee700) at pthread_create.c:301
      #31 0x00000035df6e767d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115
      </stack>

      f 14
      #14 0x0000000000c5a3df in impala::SimpleScheduler::HasLocalHost (this=0x3305ce0, data_location=...) at /usr/src/debug/impala-0.7-SNAPSHOT/be/src/statestore/simple-scheduler.h:69
      69 HostLocalityMap::iterator entry = host_map_.find(data_location.hostname);

      simple-scheduler.h:

      68 virtual bool HasLocalHost(const TNetworkAddress& data_location) {
      69   HostLocalityMap::iterator entry = host_map_.find(data_location.hostname);
      70   return (entry != host_map_.end());
      71 }

      Alan has a patch ready.
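
The patch itself is not attached to this issue. As a rough illustration of the fix the title asks for, the sketch below takes host_map_lock_ around the lookup so that find() cannot run while another thread modifies host_map_. It is not the actual Impala change: the class name SimpleSchedulerSketch, the AddBackend and RemoveHost writers, the trimmed TNetworkAddress struct, and the use of std::mutex and std::unordered_map (in place of the boost types seen in the trace) are all assumptions made to keep the example self-contained.

```cpp
#include <list>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for impala::TNetworkAddress; only the fields used here.
struct TNetworkAddress {
  std::string hostname;
  int port = 0;
};

// Hypothetical, simplified scheduler. std:: containers replace the boost ones
// from the stack trace so the sketch compiles on its own.
class SimpleSchedulerSketch {
 public:
  // Writer path (assumed, e.g. a membership update): mutates host_map_ under the lock.
  void AddBackend(const TNetworkAddress& backend) {
    std::lock_guard<std::mutex> l(host_map_lock_);
    host_map_[backend.hostname].push_back(backend);
  }

  // Writer path (assumed): removes every backend registered for a host.
  void RemoveHost(const std::string& hostname) {
    std::lock_guard<std::mutex> l(host_map_lock_);
    host_map_.erase(hostname);
  }

  // Reader path: takes the same lock as the writers, so the lookup never
  // overlaps with an insert (possible rehash) or an erase on another thread.
  bool HasLocalHost(const TNetworkAddress& data_location) {
    std::lock_guard<std::mutex> l(host_map_lock_);
    return host_map_.find(data_location.hostname) != host_map_.end();
  }

 private:
  typedef std::unordered_map<std::string, std::list<TNetworkAddress> > HostLocalityMap;

  std::mutex host_map_lock_;  // guards host_map_ (lock type is an assumption)
  HostLocalityMap host_map_;
};
```

Returning the boolean while the lock is still held, rather than handing an iterator back to the caller, keeps any reference into the map from outliving the critical section. If lookups turn out to dominate, a reader-writer lock would be one possible refinement, but a plain mutex is the simplest way to close the race.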

          People

            Assignee: Alan Choi (alan@cloudera.com)
            Reporter: Ishaan Joshi (ishaan)