IMPALA-5291

statestore-test failed during exhaustive testing of ASF RELEASE build

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.9.0
    • Fix Version/s: Impala 2.9.0
    • Component/s: Backend

      Description

      05:09:44 47/89 Test #47: statestore-test ..................***Exception: Other  1.41 sec
      05:09:44 Turning perftools heap leak checking off
      05:09:44 [==========] Running 2 tests from 2 test cases.
      05:09:44 [----------] Global test environment set-up.
      05:09:44 [----------] 1 test from StatestoreTest
      05:09:44 [ RUN      ] StatestoreTest.SmokeTest
      05:09:44 [       OK ] StatestoreTest.SmokeTest (59 ms)
      05:09:44 [----------] 1 test from StatestoreTest (59 ms total)
      05:09:44 
      05:09:44 [----------] 1 test from StatestoreSslTest
      05:09:44 [ RUN      ] StatestoreSslTest.SmokeTest
      05:09:44 [       OK ] StatestoreSslTest.SmokeTest (30 ms)
      05:09:44 [----------] 1 test from StatestoreSslTest (30 ms total)
      05:09:44 
      05:09:44 [----------] Global test environment tear-down
      05:09:44 [==========] 2 tests from 2 test cases ran. (89 ms total)
      05:09:44 [  PASSED  ] 2 tests.
      05:09:44 statestore-test: src/thrift/concurrency/Mutex.cpp:130: apache::thrift::concurrency::Mutex::impl::~impl(): Assertion `ret == 0' failed.
      05:09:44 Wrote minidump to /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/be_tests/minidumps/statestore-test/589ec4ed-e95e-b5bd-0ae3724f-07f0d17a.dmp
      

      Backtrace:

      CORE: ./be/src/statestore/core.1494158984.31180.statestore-test
      BINARY: ./be/build/latest/statestore/statestore-test
      Core was generated by `/data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/be/build/re'.
      Program terminated with signal 6, Aborted.
      #0  0x00000032650328e5 in raise () from /lib64/libc.so.6
      #1  0x00000032650340c5 in abort () from /lib64/libc.so.6
      #2  0x000000326502ba0e in __assert_fail_base () from /lib64/libc.so.6
      #3  0x000000326502bad0 in __assert_fail () from /lib64/libc.so.6
      #4  0x0000000001b6899f in boost::detail::sp_counted_impl_p<apache::thrift::concurrency::Mutex::impl>::dispose() ()
      #5  0x0000000001b58c7a in boost::detail::sp_counted_impl_pd<apache::thrift::concurrency::Mutex*, boost::checked_array_deleter<apache::thrift::concurrency::Mutex> >::dispose() ()
      #6  0x0000000001b58a89 in boost::shared_array<apache::thrift::concurrency::Mutex>::~shared_array() ()
      #7  0x0000003265035de2 in exit () from /lib64/libc.so.6
      #8  0x000000326501ece4 in __libc_start_main () from /lib64/libc.so.6
      #9  0x000000000081eb25 in _start ()
      

        Activity

        dknupp David Knupp added a comment -

        Sailesh Mukil – assigning to you since you've touched this file a bunch recently. If it should go to someone else, please let me know, or reassign as you think appropriate.

        dknupp David Knupp added a comment -

        This has been seen in a couple of other places. Raising the priority.

        tarmstrong Tim Armstrong added a comment -

        This is reproducible for me locally - it repeats every few test runs.

        tarmstrong Tim Armstrong added a comment -

        I believe the shared_array being destructed is this one in the thrift codebase in lib/cpp/src/thrift/transport/TSSLSocket.cpp:

        static shared_array<Mutex> mutexes;
        

        We don't stop the statestore services or subscribers in the test, so they may still be accessing those mutexes when the process exits. I'm guessing the IMPALA-5253 fix caused the regression, although the underlying bug is that the destructor runs without first tearing down the threads that use OpenSSL.

        tarmstrong Tim Armstrong added a comment -

        IMPALA-5291: avoid calling global destructors in statestore-test

        The workaround implemented by this patch is to exit the process using
        _exit(), which does not run global destructors.

        The "proper" solution would be either to change Thrift so that the
        destructors are not run unnecessarily on process teardown, or to extend
        the statestore services and clients to allow stopping them. It seems the
        complexity of neither is justified.

        Testing:
        Was able to reproduce the bug by running statestore-test in a loop.
        After the fix it was not reproducible.

        Change-Id: Ic185825cf262311bdc05784ebd541cf53a13cdd6
        Reviewed-on: http://gerrit.cloudera.org:8080/6872
        Reviewed-by: Matthew Jacobs <mj@cloudera.com>
        Tested-by: Impala Public Jenkins
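
        The difference between exit() and _exit() that the workaround relies on can be checked in isolation. The following is a minimal sketch for a POSIX system, not the actual patch: NoisyGlobal, g_report_fd, and GlobalDestructorRanInChild are hypothetical names introduced only for this illustration, and NoisyGlobal merely stands in for a global such as the static mutex array in Thrift. A forked child terminates with either exit() or _exit(), and the parent observes through a pipe whether the child's global destructor ran.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-in for a global object with a non-trivial destructor.
// When g_report_fd is set, the destructor reports that it ran by writing
// one byte to that file descriptor.
static int g_report_fd = -1;

struct NoisyGlobal {
  ~NoisyGlobal() {
    if (g_report_fd >= 0) {
      const char byte = 'D';
      ssize_t ignored = write(g_report_fd, &byte, 1);
      (void)ignored;
    }
  }
};

static NoisyGlobal g_noisy;

// Forks a child that terminates via _exit() or exit(). Returns true if the
// child's global destructor ran before the child terminated.
bool GlobalDestructorRanInChild(bool use_underscore_exit) {
  int fds[2];
  int rc = pipe(fds);
  assert(rc == 0);
  (void)rc;

  // Flush stdio first so the child does not re-emit buffered parent output.
  fflush(nullptr);

  pid_t pid = fork();
  assert(pid >= 0);
  if (pid == 0) {
    // Child: keep only the write end and arm the destructor's report.
    close(fds[0]);
    g_report_fd = fds[1];
    if (use_underscore_exit) {
      _exit(0);  // Terminates at once: no atexit handlers, no static destructors.
    }
    std::exit(0);  // Runs atexit handlers and static destructors first.
  }

  // Parent: read() returns 1 if the destructor wrote a byte, 0 at EOF otherwise.
  close(fds[1]);
  char byte = 0;
  ssize_t n = read(fds[0], &byte, 1);
  close(fds[0]);

  int status = 0;
  waitpid(pid, &status, 0);
  return n == 1;
}
```

        Calling GlobalDestructorRanInChild(false) should report that the destructor ran, while GlobalDestructorRanInChild(true) should report that it did not. The trade-off is that _exit() also skips atexit handlers and stdio flushing, so any buffered output has to be flushed explicitly before the call.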


          People

          • Assignee: Tim Armstrong (tarmstrong)
          • Reporter: David Knupp (dknupp)