Mesos / MESOS-8239

LIFO semaphore does not decommission correctly.

Details

    • Type: Bug
    • Status: Accepted
    • Priority: Major
    • Resolution: Unresolved
    • Component/s: libprocess

    Description

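      When libprocess is built with the DecomissionableLastInFirstOutFixedSizeSemaphore, it can get stuck during finalization. With the configuration below, a worker thread (Thread 2) stays blocked in KernelSemaphore::wait() while the main thread (Thread 1) waits in std::thread::join() inside process::ProcessManager::finalize():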

      ../configure CXX=clang++ CC=clang --disable-python --disable-java --enable-ssl --enable-libevent --enable-lock-free-run-queue --enable-lock-free-event-queue --enable-last-in-first-out-fixed-size-semaphore
      
      Thread 2 (Thread 0x7f939ffff700 (LWP 39226)):
      #0  0x00007f94641d3a0b in futex_abstimed_wait (cancel=true, private=<optimized out>, abstime=0x0, expected=0, futex=0x7f945001edc0) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:43
      #1  do_futex_wait (sem=sem@entry=0x7f945001edc0, abstime=0x0) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:223
      #2  0x00007f94641d3a9f in __new_sem_wait_slow (sem=0x7f945001edc0, abstime=0x0) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:292
      #3  0x00007f94641d3b3b in __new_sem_wait (sem=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:28
      #4  0x0000000000a7225c in KernelSemaphore::wait() () at ../../../3rdparty/libprocess/src/semaphore.hpp:115
      #5  0x0000000000a72069 in wait () at ../../../3rdparty/libprocess/src/semaphore.hpp:371
      #6  0x0000000000a3afbc in process::RunQueue::wait() () at ../../../3rdparty/libprocess/src/run_queue.hpp:147
      #7  0x0000000000a1f8f5 in dequeue () at ../../../3rdparty/libprocess/src/process.cpp:3647
      #8  0x0000000000a287e1 in operator() () at ../../../3rdparty/libprocess/src/process.cpp:2859
      #9  0x0000000000a286d5 in void std::_Bind_simple<process::ProcessManager::init_threads()::$_9 ()>::_M_invoke<>(std::_Index_tuple<>) ()
          at /opt/rh/devtoolset-4/root/usr/lib/gcc/x86_64-redhat-linux/5.3.1/../../../../include/c++/5.3.1/functional:1530
      #10 0x0000000000a286a5 in std::_Bind_simple<process::ProcessManager::init_threads()::$_9 ()>::operator()() ()
          at /opt/rh/devtoolset-4/root/usr/lib/gcc/x86_64-redhat-linux/5.3.1/../../../../include/c++/5.3.1/functional:1520
      #11 0x0000000000a28599 in std::thread::_Impl<std::_Bind_simple<process::ProcessManager::init_threads()::$_9 ()> >::_M_run() ()
          at /opt/rh/devtoolset-4/root/usr/lib/gcc/x86_64-redhat-linux/5.3.1/../../../../include/c++/5.3.1/thread:115
      #12 0x0000000000bae180 in execute_native_thread_routine ()
      #13 0x00007f94641cde25 in start_thread (arg=0x7f939ffff700) at pthread_create.c:308
      #14 0x00007f94632cf34d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
      
      Thread 1 (Thread 0x7f9466bd68c0 (LWP 37342)):
      #0  0x00007f94641cef57 in pthread_join (threadid=140272021272320, thread_return=0x0) at pthread_join.c:92
      #1  0x00007f9463b67077 in __gthread_join (__value_ptr=0x0, __threadid=<optimized out>)
          at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/x86_64-redhat-linux/bits/gthr-default.h:668
      #2  std::thread::join (this=0x2ca3740) at ../../../../../libstdc++-v3/src/c++11/thread.cc:107
      #3  0x0000000000a0e212 in process::ProcessManager::finalize() () at ../../../3rdparty/libprocess/src/process.cpp:2797
      #4  0x0000000000a0cfc3 in process::finalize(bool) () at ../../../3rdparty/libprocess/src/process.cpp:1407
      #5  0x0000000000a0ce3d in process::reinitialize(Option<std::string> const&, Option<std::string> const&, Option<std::string> const&) () at ../../../3rdparty/libprocess/src/process.cpp:1092
      #6  0x00000000005f16ed in HTTPTest::TearDownTestCase() () at ../../../3rdparty/libprocess/src/tests/http_tests.cpp:203
      #7  0x00000000008be4b3 in testing::TestCase::RunTearDownTestCase() () at googletest-release-1.8.0/googletest/include/gtest/gtest.h:891
      #8  0x00000000008d542a in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::TestCase, void>(testing::TestCase*, void (testing::TestCase::*)(), char const*) ()
          at googletest-release-1.8.0/googletest/src/gtest.cc:2402
      #9  0x00000000008be271 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::TestCase, void>(testing::TestCase*, void (testing::TestCase::*)(), char const*) ()
          at googletest-release-1.8.0/googletest/src/gtest.cc:2438
      #10 0x000000000089ee61 in testing::TestCase::Run() () at googletest-release-1.8.0/googletest/src/gtest.cc:2779
      #11 0x00000000008a6361 in testing::internal::UnitTestImpl::RunAllTests() () at googletest-release-1.8.0/googletest/src/gtest.cc:4649
      #12 0x00000000008d6f4a in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) () at googletest-release-1.8.0/googletest/src/gtest.cc:2402
      #13 0x00000000008bf511 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) () at googletest-release-1.8.0/googletest/src/gtest.cc:2438
      #14 0x00000000008a6033 in testing::UnitTest::Run() () at googletest-release-1.8.0/googletest/src/gtest.cc:4257
      #15 0x0000000000694f31 in RUN_ALL_TESTS() () at ../googletest-release-1.8.0/googletest/include/gtest/gtest.h:2233
      #16 0x0000000000693e6b in main () at ../../../3rdparty/libprocess/src/tests/main.cpp:111
      

      The worker thread is never woken, which points to a bug in the semaphore's decommission logic.
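
      For reference, the property that finalization relies on can be sketched as follows: once the semaphore is decommissioned, every thread blocked in wait() must return so that the joins in ProcessManager::finalize() can complete. This is only a minimal illustration built on a mutex and condition variable. It is not libprocess's lock-free LIFO implementation, and the names DecommissionableSemaphore and run_and_finalize are hypothetical.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical illustration only; NOT the libprocess implementation.
class DecommissionableSemaphore
{
public:
  // Blocks until a signal is available or the semaphore is decommissioned.
  // Returns true if a signal was consumed, false once decommissioned.
  bool wait()
  {
    std::unique_lock<std::mutex> lock(mutex);
    cv.wait(lock, [this]() { return count > 0 || decommissioned; });
    if (count > 0) {
      --count;
      return true;
    }
    return false;
  }

  void signal()
  {
    {
      std::lock_guard<std::mutex> lock(mutex);
      ++count;
    }
    cv.notify_one();
  }

  // Wakes ALL current waiters and makes future waits return immediately.
  void decommission()
  {
    {
      std::lock_guard<std::mutex> lock(mutex);
      decommissioned = true;
    }
    cv.notify_all();
  }

private:
  std::mutex mutex;
  std::condition_variable cv;
  std::size_t count = 0;
  bool decommissioned = false;
};

// Starts `n` workers that loop on wait(), then decommissions and joins them.
// Returns how many workers observed the decommission and exited.
inline std::size_t run_and_finalize(std::size_t n)
{
  DecommissionableSemaphore semaphore;
  std::atomic<std::size_t> exited{0};
  std::vector<std::thread> workers;

  for (std::size_t i = 0; i < n; ++i) {
    workers.emplace_back([&semaphore, &exited]() {
      while (semaphore.wait()) {
        // A real worker would dequeue and run a process here.
      }
      ++exited;
    });
  }

  semaphore.decommission();

  for (std::thread& worker : workers) {
    worker.join(); // Would hang here if any waiter were left asleep.
  }

  return exited.load();
}
```

      In this sketch the decommissioned flag is set under the same mutex the waiters check, so a wakeup cannot be missed. The backtrace above suggests that some waiter in the LIFO semaphore does not get an equivalent guarantee.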

          People

            benjaminhindman Benjamin Hindman
            bmahler Benjamin Mahler
            Votes: 0
            Watchers: 2
