Uploaded image for project: 'IMPALA'
  1. IMPALA
  2. IMPALA-4532

Crashes in impala::ThreadCounterMeasurement::Stop()

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.8.0
    • Fix Version/s: Impala 2.8.0
    • Component/s: Backend
    • Labels:

      Description

      Hit the following crash when running exhaustive test. It looks as if the execution somehow jumped to 0xffffffff due to broken vptr table.

      CORE: ./core.1479914598.27789.impalad
      BINARY: ./be/build/latest/service/impalad
      Core was generated by `/data/jenkins/workspace/impala-private-build-and-test/repos/Impala/be/build/lat'.
      Program terminated with signal 6, Aborted.
      #0  0x0000003861a328e5 in raise () from /lib64/libc.so.6
      To enable execution of this file add
              add-auto-load-safe-path /data/jenkins/workspace/impala-private-build-and-test/Impala-Toolchain/gcc-4.9.2/lib64/libstdc++.so.6.0.20-gdb.py
      line to your configuration file "/var/lib/jenkins/.gdbinit".
      To completely disable this security protection add
              set auto-load safe-path /
      line to your configuration file "/var/lib/jenkins/.gdbinit".
      For more information about this security protection see the
      "Auto-loading safe path" section in the GDB manual.  E.g., run from the shell:
              info "(gdb)Auto-loading safe path"
      #0  0x0000003861a328e5 in raise () from /lib64/libc.so.6
      #1  0x0000003861a340c5 in abort () from /lib64/libc.so.6
      #2  0x00007f8ed7a6cc55 in os::abort(bool) () from /opt/toolchain/sun-jdk-64bit-1.7.0.75/jre/lib/amd64/server/libjvm.so
      #3  0x00007f8ed7beecd7 in VMError::report_and_die() () from /opt/toolchain/sun-jdk-64bit-1.7.0.75/jre/lib/amd64/server/libjvm.so
      #4  0x00007f8ed7a71b6f in JVM_handle_linux_signal () from /opt/toolchain/sun-jdk-64bit-1.7.0.75/jre/lib/amd64/server/libjvm.so
      #5  <signal handler called>
      #6  0x00000000ffffffff in ?? ()
      #7  0x0000000001682cac in impala::ThreadCounterMeasurement::Stop (this=0x7f8e056ad780) at /data/jenkins/workspace/impala-private-build-and-test/repos/Impala/be/src/util/runtime-profile-counters.h:517
      #8  0x0000000001682dd8 in impala::ThreadCounterMeasurement::~ThreadCounterMeasurement (this=0x7f8e056ad780, __in_chrg=<value optimized out>) at /data/jenkins/workspace/impala-private-build-and-test/repos/Impala/be/src/util/runtime-profile-counters.h:527
      #9  0x00000000017c0902 in impala::BlockingJoinNode::ProcessBuildInputAsync (this=0x1dde3f80, state=0xc135b00, build_sink=0xd848a00, status=0x7f8df40198e0) at /data/jenkins/workspace/impala-private-build-and-test/repos/Impala/be/src/exec/blocking-join-node.cc:167
      #10 0x00000000017c4059 in boost::_mfi::mf3<void, impala::BlockingJoinNode, impala::RuntimeState*, impala::DataSink*, impala::Promise<impala::Status>*>::operator() (this=0xa1bf500, p=0x1dde3f80, a1=0xc135b00, a2=0xd848a00, a3=0x7f8df40198e0) at /data/jenkins/workspace/impala-private-build-and-test/Impala-Toolchain/boost-1.57.0/include/boost/bind/mem_fn_template.hpp:393
      #11 0x00000000017c3f05 in boost::_bi::list4<boost::_bi::value<impala::BlockingJoinNode*>, boost::_bi::value<impala::RuntimeState*>, boost::_bi::value<impala::DataSink*>, boost::_bi::value<impala::Promise<impala::Status>*> >::operator()<boost::_mfi::mf3<void, impala::BlockingJoinNode, impala::RuntimeState*, impala::DataSink*, impala::Promise<impala::Status>*>, boost::_bi::list0> (this=0xa1bf510, f=..., a=...) at /data/jenkins/workspace/impala-private-build-and-test/Impala-Toolchain/boost-1.57.0/include/boost/bind/bind.hpp:457
      #12 0x00000000017c3c51 in boost::_bi::bind_t<void, boost::_mfi::mf3<void, impala::BlockingJoinNode, impala::RuntimeState*, impala::DataSink*, impala::Promise<impala::Status>*>, boost::_bi::list4<boost::_bi::value<impala::BlockingJoinNode*>, boost::_bi::value<impala::RuntimeState*>, boost::_bi::value<impala::DataSink*>, boost::_bi::value<impala::Promise<impala::Status>*> > >::operator() (this=0xa1bf500) at /data/jenkins/workspace/impala-private-build-and-test/Impala-Toolchain/boost-1.57.0/include/boost/bind/bind_template.hpp:20
      #13 0x00000000017c394a in boost::detail::function::void_function_obj_invoker0<boost::_bi::bind_t<void, boost::_mfi::mf3<void, impala::BlockingJoinNode, impala::RuntimeState*, impala::DataSink*, impala::Promise<impala::Status>*>, boost::_bi::list4<boost::_bi::value<impala::BlockingJoinNode*>, boost::_bi::value<impala::RuntimeState*>, boost::_bi::value<impala::DataSink*>, boost::_bi::value<impala::Promise<impala::Status>*> > >, void>::invoke (function_obj_ptr=...) at /data/jenkins/workspace/impala-private-build-and-test/Impala-Toolchain/boost-1.57.0/include/boost/function/function_template.hpp:153
      #14 0x0000000001340c56 in boost::function0<void>::operator() (this=0x7f8e056adc40) at /data/jenkins/workspace/impala-private-build-and-test/Impala-Toolchain/boost-1.57.0/include/boost/function/function_template.hpp:767
      #15 0x00000000015ed929 in impala::Thread::SuperviseThread (name=..., category=..., functor=..., thread_started=0x7f8df40196d0) at /data/jenkins/workspace/impala-private-build-and-test/repos/Impala/be/src/util/thread.cc:317
      #16 0x00000000015f4902 in boost::_bi::list4<boost::_bi::value<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::_bi::value<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::_bi::value<boost::function<void()> >, boost::_bi::value<impala::Promise<long int>*> >::operator()<void (*)(const std::basic_string<char>&, const std::basic_string<char>&, boost::function<void()>, impala::Promise<long int>*), boost::_bi::list0>(boost::_bi::type<void>, void (*&)(const std::basic_string<char, std::char_traits<char>, std::allocator<char> > &, const std::basic_string<char, std::char_traits<char>, std::allocator<char> > &, boost::function<void()>, impala::Promise<long> *), boost::_bi::list0 &, int) (this=0xa2161c0, f=@0xa2161b8, a=...) at /data/jenkins/workspace/impala-private-build-and-test/Impala-Toolchain/boost-1.57.0/include/boost/bind/bind.hpp:457
      #17 0x00000000015f4845 in boost::_bi::bind_t<void, void (*)(const std::basic_string<char, std::char_traits<char>, std::allocator<char> >&, const std::basic_string<char, std::char_traits<char>, std::allocator<char> >&, boost::function<void()>, impala::Promise<long int>*), boost::_bi::list4<boost::_bi::value<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::_bi::value<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::_bi::value<boost::function<void()> >, boost::_bi::value<impala::Promise<long int>*> > >::operator()(void) (this=0xa2161b8) at /data/jenkins/workspace/impala-private-build-and-test/Impala-Toolchain/boost-1.57.0/include/boost/bind/bind_template.hpp:20
      #18 0x00000000015f47a0 in boost::detail::thread_data<boost::_bi::bind_t<void, void (*)(const std::basic_string<char, std::char_traits<char>, std::allocator<char> >&, const std::basic_string<char, std::char_traits<char>, std::allocator<char> >&, boost::function<void()>, impala::Promise<long int>*), boost::_bi::list4<boost::_bi::value<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::_bi::value<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, boost::_bi::value<boost::function<void()> >, boost::_bi::value<impala::Promise<long int>*> > > >::run(void) (this=0xa216000) at /data/jenkins/workspace/impala-private-build-and-test/Impala-Toolchain/boost-1.57.0/include/boost/thread/detail/thread.hpp:116
      #19 0x0000000001a42cfa in thread_proxy ()
      #20 0x0000003861e07851 in start_thread () from /lib64/libpthread.so.0
      #21 0x0000003861ae894d in clone () from /lib64/libc.so.6
      

        Activity

        Hide
        kwho Michael Ho added a comment -

        Anuj Phadke, assigning to you for now as your recent change introduced this thread counter. I stared at the change for a little bit but didn't see anything obviously wrong.

        Show
        kwho Michael Ho added a comment - Anuj Phadke , assigning to you for now as your recent change introduced this thread counter. I stared at the change for a little bit but didn't see anything obviously wrong.
        Hide
        kwho Michael Ho added a comment -

        The output from minidump suggests we jumped to 0xffffffff.

        Crash reason:  SIGSEGV
        Crash address: 0xffffffff
        Process uptime: not available
        
        Thread 438 (crashed)
         0  0xffffffff
            rax = 0x0000000019d1cc00   rdx = 0x000000000098956a
            rcx = 0x0000000000000000   rbx = 0x00000000ffffffff
            rsi = 0x000000000098956a   rdi = 0x0000000019d1cc00
            rbp = 0x00007f8e056ad730   rsp = 0x00007f8e056ad638
             r8 = 0x00007f8df40198e0    r9 = 0x0000000000000002
            r10 = 0x000000007fffffff   r11 = 0x0000000000000206
            r12 = 0x000000000c135b00   r13 = 0x000000000d848a00
            r14 = 0x00007f8df40198e0   r15 = 0x0000000000000003
            rip = 0x00000000ffffffff
            Found by: given as instruction pointer in context
         1  impalad + 0x1282dd8
            rbp = 0x00007f8e056ad750   rsp = 0x00007f8e056ad740
            rip = 0x0000000001682dd8
            Found by: previous frame's frame pointer
         2  impalad + 0x13c0902
            rbp = 0x00007f8e056ad880   rsp = 0x00007f8e056ad760
            rip = 0x00000000017c0902
            Found by: previous frame's frame pointer
         3  impalad + 0x13c4059
            rbp = 0x00007f8e056ad8c0   rsp = 0x00007f8e056ad890
            rip = 0x00000000017c4059
            Found by: previous frame's frame pointer
         4  impalad + 0x13c3f05
            rbp = 0x00007f8e056ad910   rsp = 0x00007f8e056ad8d0
            rip = 0x00000000017c3f05
            Found by: previous frame's frame pointer
         5  impalad + 0x13c3c51
            rbp = 0x00007f8e056ad960   rsp = 0x00007f8e056ad920
            rip = 0x00000000017c3c51
            Found by: previous frame's frame pointer
         6  impalad + 0x13c394a
            rbp = 0x00007f8e056ad990   rsp = 0x00007f8e056ad970
            rip = 0x00000000017c394a
            Found by: previous frame's frame pointer
         7  impalad + 0xf40c56
            rbp = 0x00007f8e056ad9d0   rsp = 0x00007f8e056ad9a0
            rip = 0x0000000001340c56
            Found by: previous frame's frame pointer
         8  impalad + 0x11ed929
            rbp = 0x00007f8e056adc10   rsp = 0x00007f8e056ad9e0
            rip = 0x00000000015ed929
            Found by: previous frame's frame pointer
         9  impalad + 0x11f4902
            rbp = 0x00007f8e056adc80   rsp = 0x00007f8e056adc20
            rip = 0x00000000015f4902
            Found by: previous frame's frame pointer
        10  impalad + 0x11f4845
            rbp = 0x00007f8e056adcd0   rsp = 0x00007f8e056adc90
            rip = 0x00000000015f4845
            Found by: previous frame's frame pointer
        

        Inspecting the stack suggests we were jumping from impala::ThreadCounterMeasurement::Stop().
        In fact, matches the following line in impala::ThreadCounterMeasurement::Stop()

            counters_->total_time_->Add(sw_.ElapsedTime());
        

        It appears though that counters_->total_time_ is corrupted:

        (gdb) p *(this->counters_)
        $2 = {total_time_ = 0x19d1cc00, user_time_ = 0xf531a20, sys_time_ = 0x9d04080, voluntary_context_switches_ = 0x9d04800, involuntary_context_switches_ = 0xf530040}
        (gdb) p *(this->counters_->total_time_)
        $3 = {_vptr.Counter = 0x19d1d830, value_ = {value_ = 18}, unit_ = 4294967295}  <<--- unit_ is wrong
        (gdb) x/10gx 0x19d1d830                        <<---_vptr.Counter contains garbage ?
        0x19d1d830:	0x0000000019d1d980	0x0000000000000012
        0x19d1d840:	0x00000000ffffffff	0x6c61576c61746f54
        0x19d1d850:	0x69546b636f6c436c	0x000000002d00656d
        0x19d1d860:	0x0000000019d1ccc0	0x0000000019d1d530
        0x19d1d870:	0x0000000019d1ccc0	0x000000001f6946f0
        (gdb) x/10gx 0x19d1cc00                      
        0x19d1cc00:	0x0000000019d1d830	0x0000000000000012
        0x19d1cc10:	0x00000000ffffffff	0x3433636335343033
        

        Disassembly of impala::ThreadCounterMeasurement::Stop() suggests the code was trying to
        load the address of Add() from vptr table into %rbx and made an indirect call to it. The vptr table
        somehow contains garbage.

           0x1682c60 <impala::ThreadCounterMeasurement::Stop()+352>:	mov    -0xe8(%rbp),%rax
           0x1682c67 <impala::ThreadCounterMeasurement::Stop()+359>:	mov    0xb8(%rax),%rax // rax = counters_
           0x1682c6e <impala::ThreadCounterMeasurement::Stop()+366>:	mov    (%rax),%rax     // rax = counters_->total_time
           0x1682c71 <impala::ThreadCounterMeasurement::Stop()+369>:	mov    (%rax),%rax     // rax = counters_->total_time._vptr.Counter
           0x1682c74 <impala::ThreadCounterMeasurement::Stop()+372>:	add    $0x10,%rax
           0x1682c78 <impala::ThreadCounterMeasurement::Stop()+376>:	mov    (%rax),%rbx
           0x1682c7b <impala::ThreadCounterMeasurement::Stop()+379>:	mov    -0xe8(%rbp),%rax
           0x1682c82 <impala::ThreadCounterMeasurement::Stop()+386>:	add    $0x98,%rax
           0x1682c88 <impala::ThreadCounterMeasurement::Stop()+392>:	mov    %rax,%rdi
           0x1682c8b <impala::ThreadCounterMeasurement::Stop()+395>:	callq  0x11805da <impala::MonotonicStopWatch::ElapsedTime() const>
           0x1682c90 <impala::ThreadCounterMeasurement::Stop()+400>:	mov    %rax,%rdx
           0x1682c93 <impala::ThreadCounterMeasurement::Stop()+403>:	mov    -0xe8(%rbp),%rax
           0x1682c9a <impala::ThreadCounterMeasurement::Stop()+410>:	mov    0xb8(%rax),%rax
           0x1682ca1 <impala::ThreadCounterMeasurement::Stop()+417>:	mov    (%rax),%rax
           0x1682ca4 <impala::ThreadCounterMeasurement::Stop()+420>:	mov    %rdx,%rsi
           0x1682ca7 <impala::ThreadCounterMeasurement::Stop()+423>:	mov    %rax,%rdi
           0x1682caa <impala::ThreadCounterMeasurement::Stop()+426>:	callq  *%rbx
        => 0x1682cac <impala::ThreadCounterMeasurement::Stop()+428>:	mov    -0xe8(%rbp),%rax
           0x1682cb3 <impala::ThreadCounterMeasurement::Stop()+435>:	mov    0xb8(%rax),%rax
        
        Show
        kwho Michael Ho added a comment - The output from minidump suggests we jumped to 0xffffffff. Crash reason: SIGSEGV Crash address: 0xffffffff Process uptime: not available Thread 438 (crashed) 0 0xffffffff rax = 0x0000000019d1cc00 rdx = 0x000000000098956a rcx = 0x0000000000000000 rbx = 0x00000000ffffffff rsi = 0x000000000098956a rdi = 0x0000000019d1cc00 rbp = 0x00007f8e056ad730 rsp = 0x00007f8e056ad638 r8 = 0x00007f8df40198e0 r9 = 0x0000000000000002 r10 = 0x000000007fffffff r11 = 0x0000000000000206 r12 = 0x000000000c135b00 r13 = 0x000000000d848a00 r14 = 0x00007f8df40198e0 r15 = 0x0000000000000003 rip = 0x00000000ffffffff Found by: given as instruction pointer in context 1 impalad + 0x1282dd8 rbp = 0x00007f8e056ad750 rsp = 0x00007f8e056ad740 rip = 0x0000000001682dd8 Found by: previous frame's frame pointer 2 impalad + 0x13c0902 rbp = 0x00007f8e056ad880 rsp = 0x00007f8e056ad760 rip = 0x00000000017c0902 Found by: previous frame's frame pointer 3 impalad + 0x13c4059 rbp = 0x00007f8e056ad8c0 rsp = 0x00007f8e056ad890 rip = 0x00000000017c4059 Found by: previous frame's frame pointer 4 impalad + 0x13c3f05 rbp = 0x00007f8e056ad910 rsp = 0x00007f8e056ad8d0 rip = 0x00000000017c3f05 Found by: previous frame's frame pointer 5 impalad + 0x13c3c51 rbp = 0x00007f8e056ad960 rsp = 0x00007f8e056ad920 rip = 0x00000000017c3c51 Found by: previous frame's frame pointer 6 impalad + 0x13c394a rbp = 0x00007f8e056ad990 rsp = 0x00007f8e056ad970 rip = 0x00000000017c394a Found by: previous frame's frame pointer 7 impalad + 0xf40c56 rbp = 0x00007f8e056ad9d0 rsp = 0x00007f8e056ad9a0 rip = 0x0000000001340c56 Found by: previous frame's frame pointer 8 impalad + 0x11ed929 rbp = 0x00007f8e056adc10 rsp = 0x00007f8e056ad9e0 rip = 0x00000000015ed929 Found by: previous frame's frame pointer 9 impalad + 0x11f4902 rbp = 0x00007f8e056adc80 rsp = 0x00007f8e056adc20 rip = 0x00000000015f4902 Found by: previous frame's frame pointer 10 impalad + 0x11f4845 rbp = 0x00007f8e056adcd0 rsp = 0x00007f8e056adc90 rip = 0x00000000015f4845 Found by: previous frame's frame pointer Inspecting the stack suggests we were jumping from impala::ThreadCounterMeasurement::Stop() . In fact, matches the following line in impala::ThreadCounterMeasurement::Stop() counters_->total_time_->Add(sw_.ElapsedTime()); It appears though that counters_->total_time_ is corrupted: (gdb) p *(this->counters_) $2 = {total_time_ = 0x19d1cc00, user_time_ = 0xf531a20, sys_time_ = 0x9d04080, voluntary_context_switches_ = 0x9d04800, involuntary_context_switches_ = 0xf530040} (gdb) p *(this->counters_->total_time_) $3 = {_vptr.Counter = 0x19d1d830, value_ = {value_ = 18}, unit_ = 4294967295} <<--- unit_ is wrong (gdb) x/10gx 0x19d1d830 <<---_vptr.Counter contains garbage ? 0x19d1d830: 0x0000000019d1d980 0x0000000000000012 0x19d1d840: 0x00000000ffffffff 0x6c61576c61746f54 0x19d1d850: 0x69546b636f6c436c 0x000000002d00656d 0x19d1d860: 0x0000000019d1ccc0 0x0000000019d1d530 0x19d1d870: 0x0000000019d1ccc0 0x000000001f6946f0 (gdb) x/10gx 0x19d1cc00 0x19d1cc00: 0x0000000019d1d830 0x0000000000000012 0x19d1cc10: 0x00000000ffffffff 0x3433636335343033 Disassembly of impala::ThreadCounterMeasurement::Stop() suggests the code was trying to load the address of Add() from vptr table into %rbx and made an indirect call to it. The vptr table somehow contains garbage. 0x1682c60 <impala::ThreadCounterMeasurement::Stop()+352>: mov -0xe8(%rbp),%rax 0x1682c67 <impala::ThreadCounterMeasurement::Stop()+359>: mov 0xb8(%rax),%rax // rax = counters_ 0x1682c6e <impala::ThreadCounterMeasurement::Stop()+366>: mov (%rax),%rax // rax = counters_->total_time 0x1682c71 <impala::ThreadCounterMeasurement::Stop()+369>: mov (%rax),%rax // rax = counters_->total_time._vptr.Counter 0x1682c74 <impala::ThreadCounterMeasurement::Stop()+372>: add $0x10,%rax 0x1682c78 <impala::ThreadCounterMeasurement::Stop()+376>: mov (%rax),%rbx 0x1682c7b <impala::ThreadCounterMeasurement::Stop()+379>: mov -0xe8(%rbp),%rax 0x1682c82 <impala::ThreadCounterMeasurement::Stop()+386>: add $0x98,%rax 0x1682c88 <impala::ThreadCounterMeasurement::Stop()+392>: mov %rax,%rdi 0x1682c8b <impala::ThreadCounterMeasurement::Stop()+395>: callq 0x11805da <impala::MonotonicStopWatch::ElapsedTime() const> 0x1682c90 <impala::ThreadCounterMeasurement::Stop()+400>: mov %rax,%rdx 0x1682c93 <impala::ThreadCounterMeasurement::Stop()+403>: mov -0xe8(%rbp),%rax 0x1682c9a <impala::ThreadCounterMeasurement::Stop()+410>: mov 0xb8(%rax),%rax 0x1682ca1 <impala::ThreadCounterMeasurement::Stop()+417>: mov (%rax),%rax 0x1682ca4 <impala::ThreadCounterMeasurement::Stop()+420>: mov %rdx,%rsi 0x1682ca7 <impala::ThreadCounterMeasurement::Stop()+423>: mov %rax,%rdi 0x1682caa <impala::ThreadCounterMeasurement::Stop()+426>: callq *%rbx => 0x1682cac <impala::ThreadCounterMeasurement::Stop()+428>: mov -0xe8(%rbp),%rax 0x1682cb3 <impala::ThreadCounterMeasurement::Stop()+435>: mov 0xb8(%rax),%rax
        Hide
        tarmstrong Tim Armstrong added a comment -

        I hit this too - seems to be reproducible.

        Show
        tarmstrong Tim Armstrong added a comment - I hit this too - seems to be reproducible.
        Hide
        tarmstrong Tim Armstrong added a comment -

        I think the problem is that there's a race in ProcessBuildInputAsync() between the thread counters being updated and the runtime state being torn down. The problem is that once the 'status' promise is set, the query can start being torn down, but the counters aren't updated until after that.

        The race already existed, I think the timing may have just changed.

        Show
        tarmstrong Tim Armstrong added a comment - I think the problem is that there's a race in ProcessBuildInputAsync() between the thread counters being updated and the runtime state being torn down. The problem is that once the 'status' promise is set, the query can start being torn down, but the counters aren't updated until after that. The race already existed, I think the timing may have just changed.
        Hide
        kwho Michael Ho added a comment -

        Good point. The BlockingJoinNode::ProcessBuildInputAndOpenProbe() could proceed once the status is set and the rest of the fragment may have completed before the build side thread gets a chance to update the counter.

        Show
        kwho Michael Ho added a comment - Good point. The BlockingJoinNode::ProcessBuildInputAndOpenProbe() could proceed once the status is set and the rest of the fragment may have completed before the build side thread gets a chance to update the counter.
        Hide
        anujphadke Anuj Phadke added a comment -

        Michael Ho - I see that
        https://github.com/apache/incubator-impala/commit/f2ce30e6448fb1a5c1d5721c0de1c768261770b3
        is already merged. Thanks for looking into it.
        I have reassigned it back to you. Can we mark it as resolved?

        I read your comments on the review -
        A better fix could be to make the main thread call Join() on the build thread.
        I can work on the follow up patch.

        Show
        anujphadke Anuj Phadke added a comment - Michael Ho - I see that https://github.com/apache/incubator-impala/commit/f2ce30e6448fb1a5c1d5721c0de1c768261770b3 is already merged. Thanks for looking into it. I have reassigned it back to you. Can we mark it as resolved? I read your comments on the review - A better fix could be to make the main thread call Join() on the build thread. I can work on the follow up patch.
        Hide
        henryr Henry Robinson added a comment -

        Anuj Phadke - please try to avoid linking to non-public resources on the Impala JIRA, thanks!

        Show
        henryr Henry Robinson added a comment - Anuj Phadke - please try to avoid linking to non-public resources on the Impala JIRA, thanks!
        Hide
        kwho Michael Ho added a comment -

        https://github.com/apache/incubator-impala/commit/f2ce30e6448fb1a5c1d5721c0de1c768261770b3

        IMPALA-4532: Fix use-after-free in ProcessBuildInputAsync()
        Once the Promise (of Status) is set in ProcessBuildInputAsync(),
        the main thread in ProcessBuildInputAndOpenProbe() may proceed
        to finish up the query and free RuntimeState. So, it's unsafe
        to access the RuntimeState afterwards. Commit bb1c633 broke that
        assumption (which is arguably fragile). This change fixes the
        problem by adding a scope for the that counter to avoid the
        use-after-free problem.

        Change-Id: I6bfd094e2e9500f1b7843486f3f745cb921764d4
        Reviewed-on: http://gerrit.cloudera.org:8080/5246
        Reviewed-by: Dan Hecht <dhecht@cloudera.com>
        Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
        Tested-by: Internal Jenkins

        Show
        kwho Michael Ho added a comment - https://github.com/apache/incubator-impala/commit/f2ce30e6448fb1a5c1d5721c0de1c768261770b3 IMPALA-4532 : Fix use-after-free in ProcessBuildInputAsync() Once the Promise (of Status) is set in ProcessBuildInputAsync(), the main thread in ProcessBuildInputAndOpenProbe() may proceed to finish up the query and free RuntimeState. So, it's unsafe to access the RuntimeState afterwards. Commit bb1c633 broke that assumption (which is arguably fragile). This change fixes the problem by adding a scope for the that counter to avoid the use-after-free problem. Change-Id: I6bfd094e2e9500f1b7843486f3f745cb921764d4 Reviewed-on: http://gerrit.cloudera.org:8080/5246 Reviewed-by: Dan Hecht <dhecht@cloudera.com> Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com> Tested-by: Internal Jenkins

          People

          • Assignee:
            kwho Michael Ho
            Reporter:
            kwho Michael Ho
          • Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development