  1. Mesos
  2. MESOS-3363

custom executor's child process intermittently leaks to be a child of slave


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.23.0
    • None
    • None

    Description

      I was testing a custom executor implementation that manages the life cycle of multiple child processes. When the executor is SIGTERM'd it sends a SIGTERM to each child process and then self-terminates.

      In some cases, the child processes do not die, even though the parent process (the custom executor) does. Instead, the child procs are re-parented to the slave process, where they live on indefinitely.

      My custom executor is written in Go, and I've found a useful Go/Linux-specific setting that allows me to configure a signal to be sent to child procs upon the death of the calling thread in the parent. (see https://golang.org/src/syscall/exec_linux.go?s=6285:6843#1 for details). I've since configured the custom executor to specify that a SIGKILL be sent to all child procs upon termination of the executor (parent) process: child procs are still sent a SIGTERM upon receipt of such by the executor, but the SIGKILL upon executor death now acts as a fallback.

      Since implementing the above workaround, I have not been able to reproduce the problem as previously described. This particular syscall is implemented in very few OSes (the Golang hack only supports Linux), so I'm not sure how I'd go about something similar on Windows, OS X, BSD, etc.

      It seems like Mesos should take on the responsibility of ensuring that when an executor is killed, all of its child procs are also eventually killed. Given that it's an intermittent and hard-to-reproduce problem, I'm assuming that Mesos does attempt to ensure executor child proc death, but that the implementation is racy/leaky.

            People

              Assignee: Unassigned
              Reporter: James DeFelice (jdef)
              Votes: 0
              Watchers: 2
