IMPALA-71

Injecting failure during SORT_NODE/PREPARE causes impalad to core dump in impala::Coordinator::CancelRemoteFragments (this=0x5f1bc00) at coordinator.cc:855


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Impala 0.6
    • Fix Version/s: Impala 0.7
    • Component/s: None
    • Labels: None

    Description

      impalad dumps core when a failure is injected during the PREPARE phase of a SORT_NODE (node ids 3 and 7 in the query below). Other plan node types may be affected as well.

      set debug_action=3:PREPARE:FAIL;
      select a.int_col, count(b.int_col) int_sum from hbasealltypesagg a join (select * from alltypes where year=2009 and month=1 order by int_col limit 25
      union all
      select * from alltypes where year=2009 and month=2 limit 10) b on (a.int_col = b.int_col)
      group by a.int_col
      order by int_sum
      limit 20
      
      (gdb) bt
      #0  0x00007f02ef279425 in raise () from /lib/x86_64-linux-gnu/libc.so.6
      #1  0x00007f02ef27cb8b in abort () from /lib/x86_64-linux-gnu/libc.so.6
      #2  0x00007f02ee611727 in os::abort(bool) () from /usr/lib/jvm/java-6-sun/jre/lib/amd64/server/libjvm.so
      #3  0x00007f02ee764cc8 in VMError::report_and_die() () from /usr/lib/jvm/java-6-sun/jre/lib/amd64/server/libjvm.so
      #4  0x00007f02ee6180e5 in JVM_handle_linux_signal () from /usr/lib/jvm/java-6-sun/jre/lib/amd64/server/libjvm.so
      #5  0x00007f02ee6143ee in signalHandler(int, siginfo*, void*) () from /usr/lib/jvm/java-6-sun/jre/lib/amd64/server/libjvm.so
      #6  <signal handler called>
      #7  0x00007f02efe1de84 in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
      #8  0x0000000000a41433 in boost::mutex::lock (this=0x320) at /usr/include/boost/thread/pthread/mutex.hpp:52
      #9  0x0000000000a41b5a in boost::lock_guard<boost::mutex>::lock_guard (this=0x7f02d9a5e6f0, m_=...) at /usr/include/boost/thread/locks.hpp:257
      #10 0x0000000000ba871f in impala::Coordinator::CancelRemoteFragments (this=0x5f1bc00) at /home/lskuff/dev/Impala/be/src/runtime/coordinator.cc:855
      #11 0x0000000000ba866e in impala::Coordinator::CancelInternal (this=0x5f1bc00) at /home/lskuff/dev/Impala/be/src/runtime/coordinator.cc:842
      #12 0x0000000000ba3f0c in impala::Coordinator::Exec (this=0x5f1bc00, query_id=..., request=0x5086550, query_options=...)
          at /home/lskuff/dev/Impala/be/src/runtime/coordinator.cc:416
      #13 0x0000000000a4f0de in impala::ImpalaServer::QueryExecState::Exec (this=0x5086000, exec_request=0x7f02d9a5f350) at /home/lskuff/dev/Impala/be/src/service/impala-server.cc:156
      #14 0x0000000000a5b455 in impala::ImpalaServer::ExecuteInternal (this=0x380a380, request=..., session_key=..., registered_exec_state=0x7f02d9a5f77f, exec_state=0x7f02d9a5f840)
          at /home/lskuff/dev/Impala/be/src/service/impala-server.cc:948
      #15 0x0000000000a5aff0 in impala::ImpalaServer::Execute (this=0x380a380, request=..., session_key=..., exec_state=0x7f02d9a5f840)
          at /home/lskuff/dev/Impala/be/src/service/impala-server.cc:904
      #16 0x0000000000ad04a6 in impala::ImpalaServer::query (this=0x380a380, query_handle=..., query=...) at /home/lskuff/dev/Impala/be/src/service/impala-beeswax-server.cc:148
      #17 0x0000000000c81418 in beeswax::BeeswaxServiceProcessor::process_query (this=0x29054a0, seqid=0, iprot=0x3c84800, oprot=0x3c84b00, callContext=0x5e99ef8)
          at /home/lskuff/dev/Impala/be/generated-sources/gen-cpp/BeeswaxService.cpp:2927
      #18 0x0000000000c8114f in beeswax::BeeswaxServiceProcessor::dispatchCall (this=0x29054a0, iprot=0x3c84800, oprot=0x3c84b00, fname=..., seqid=0, callContext=0x5e99ef8)
          at /home/lskuff/dev/Impala/be/generated-sources/gen-cpp/BeeswaxService.cpp:2900
      #19 0x0000000000c6fef3 in impala::ImpalaServiceProcessor::dispatchCall (this=0x29054a0, iprot=0x3c84800, oprot=0x3c84b00, fname=..., seqid=0, callContext=0x5e99ef8)
          at /home/lskuff/dev/Impala/be/generated-sources/gen-cpp/ImpalaService.cpp:884
      #20 0x0000000000a651fc in apache::thrift::TDispatchProcessor::process (this=0x29054a0, in=..., out=..., connectionContext=0x5e99ef8)
          at /home/lskuff/dev/Impala/thirdparty/thrift-0.9.0/build/include/thrift/TDispatchProcessor.h:121
      #21 0x00000000015c586d in apache::thrift::server::TThreadPoolServer::Task::run (this=0x3191aa0) at src/thrift/server/TThreadPoolServer.cpp:70
      #22 0x00000000015b60af in apache::thrift::concurrency::ThreadManager::Task::run (this=0x3c84c80) at src/thrift/concurrency/ThreadManager.cpp:187
      #23 0x00000000015b8b09 in apache::thrift::concurrency::ThreadManager::Worker::run (this=0x40e3350) at src/thrift/concurrency/ThreadManager.cpp:316
      #24 0x00000000015cac46 in apache::thrift::concurrency::PthreadThread::threadMain (arg=0x5071760) at src/thrift/concurrency/PosixThreadFactory.cpp:208
      #25 0x00007f02efe1be9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
      #26 0x00007f02ef336cbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
      

      Test Vector

      failpoints.py::TestFailpoints::()::test_failpoints[action: FAIL | target_node: ('SORT_NODE', [3, 7]) | exec_option: {'disable_codegen': False, 'batch_size': 0, 'num_nodes': 0} | table_format: text/none | location: PREPARE]
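
The debug_action value in the reproduction appears to follow the pattern <node id>:<location>:<action>, and the test vector names two SORT_NODE ids (3 and 7). A minimal sketch of producing the corresponding set statements is shown here. The helper names build_debug_action and build_set_statements are hypothetical and are not taken from failpoints.py, and the three-part format is inferred only from the single example in this report.

```python
def build_debug_action(node_id, location, action):
    """Format a debug_action value as <node id>:<location>:<action>.

    Hypothetical helper: the format is inferred from the example
    'set debug_action=3:PREPARE:FAIL;' in this report.
    """
    return f"{node_id}:{location}:{action}"


def build_set_statements(node_ids, location, action):
    """Return one 'set debug_action=...;' statement per target node id."""
    return [
        f"set debug_action={build_debug_action(node_id, location, action)};"
        for node_id in node_ids
    ]


# Values taken from the test vector: SORT_NODE ids [3, 7], PREPARE, FAIL.
statements = build_set_statements([3, 7], "PREPARE", "FAIL")
for statement in statements:
    print(statement)
```

Each statement would be issued before the query above, one run per node id, to check whether both SORT_NODE instances trigger the crash.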
      

          People

            Assignee: Skye Wanderman-Milne (skye)
            Reporter: Lenni Kuff (lskuff)