Mesos / MESOS-9808

libprocess can deadlock on termination (cleanup() vs use() + terminate())


Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • 1.9.0
    • None

    Description

      Using process::loop() together with the common libprocess pattern (a Process wrapper plus dispatching) is prone to deadlock on libprocess termination if the code does not wait for the loop to exit before terminating.

      The deadlock itself is not directly caused by process::loop(), though.
      It occurs in the following setup with two processes (let's name them A and B).

      Thread 1 tries to clean up process A. It locks processes_mutex and hangs here:
      https://github.com/apache/mesos/blob/663bfa68b6ab68f4c28ed6a01ac42ac2ad23ac07/3rdparty/libprocess/src/process.cpp#L3079
      waiting for process A to have no strong references.

      Thread 2 begins by creating a ProcessReference in ProcessManager::deliver(UPID&) called for process A: https://github.com/apache/mesos/blob/663bfa68b6ab68f4c28ed6a01ac42ac2ad23ac07/3rdparty/libprocess/src/process.cpp#L2799

      and ends up waiting for processes_mutex in ProcessManager::terminate() for process B:
      https://github.com/apache/mesos/blob/663bfa68b6ab68f4c28ed6a01ac42ac2ad23ac07/3rdparty/libprocess/src/process.cpp#L3155
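      The cycle described above can be sketched with standard C++ threads only. This is a simplified model, not libprocess code: DeadlockModel, Outcome, simulate() and all field names are hypothetical, and a timed mutex plus a stop flag are used only so that the model terminates instead of hanging as the real code does.

```cpp
#include <atomic>
#include <chrono>
#include <mutex>
#include <thread>

// Hypothetical model; none of these names are libprocess APIs.
struct DeadlockModel {
  std::timed_mutex processes_mutex;              // stands in for processes_mutex
  std::atomic<int> refs_to_a{0};                 // strong references to process A
  std::atomic<bool> cleanup_holds_mutex{false};  // set once thread 1 owns the mutex
  std::atomic<bool> stop{false};                 // lets the model exit instead of hanging
};

struct Outcome {
  bool cleanup_saw_refs_expire;  // did thread 1 make progress?
  bool terminate_got_mutex;      // did thread 2 make progress?
};

Outcome simulate() {
  DeadlockModel m;
  Outcome out{false, false};

  // Thread 2 has already created a reference to A (as deliver() does).
  m.refs_to_a = 1;

  // Thread 1: cleanup of A. Takes the mutex, then waits for references to expire.
  std::thread cleanup([&] {
    std::lock_guard<std::timed_mutex> lock(m.processes_mutex);
    m.cleanup_holds_mutex = true;
    while (m.refs_to_a.load() > 0 && !m.stop.load()) {
      std::this_thread::yield();
    }
    out.cleanup_saw_refs_expire = (m.refs_to_a.load() == 0);
  });

  // Thread 2 (modeled by the calling thread): still holding the reference
  // to A, it needs the mutex in order to terminate B.
  while (!m.cleanup_holds_mutex.load()) {
    std::this_thread::yield();
  }
  out.terminate_got_mutex =
      m.processes_mutex.try_lock_for(std::chrono::milliseconds(50));
  if (out.terminate_got_mutex) {
    m.processes_mutex.unlock();
  }

  // Break the cycle artificially; the real code has no such escape hatch.
  m.stop = true;
  cleanup.join();
  m.refs_to_a = 0;  // the reference is dropped only after both sides gave up
  return out;
}
```

      In this model both fields of the returned Outcome stay false: the cleanup side never sees the reference count reach zero, and the terminate side never acquires the mutex. Without the timeout and the stop flag, both threads would wait forever.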

      -----------------
      In the observed case, terminate() for process B was triggered by a destructor of a process-wrapping object owned by a libprocess loop executing on A.

      I'm attaching the stacks captured at the deadlock. Stacks of the threads that block one another are in deadlock_stacks_filtered.txt. Note frame #1 in Thread 5 (waiting for all references to expire) and frames #48 and #8 in Thread 19 (creating a reference and waiting for processes_mutex).

      Attachments

        1. deadlock_stacks_filtered.txt
          54 kB
          Andrei Sekretenko
        2. deadlock_stacks.txt
          266 kB
          Andrei Sekretenko
        3. deadlock_stacks_with_fix.txt
          32 kB
          Andrei Sekretenko

          People

            bmahler Benjamin Mahler
            asekretenko Andrei Sekretenko