Mesos / MESOS-6933

Executor does not respect grace period

    Details

    • Type: Bug
    • Status: Accepted
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: executor
    • Labels:
      None

      Description

      The Mesos command executor tries to support a kill grace period with escalation, but unfortunately it does not work. It launches the command by wrapping it in sh -c, which makes the process tree look like this:

      Received killTask
      Shutting down
      Sending SIGTERM to process tree at pid 18
      Sent SIGTERM to the following process trees:
      [ 
      -+- 18 sh -c cd offer-i18n-0.1.24 && LD_PRELOAD=../librealresources.so ./bin/offer-i18n -e prod -p $PORT0 
       \--- 19 command...
      ]
      Command terminated with signal Terminated (pid: 18)
      

      This causes sh to exit immediately, and the executor along with it, while the wrapped command might need more time to finish. As a result, the executor thinks the command terminated gracefully, so it does not escalate to SIGKILL.

      This causes leaks when the POSIX containerizer is used: if the command ignores SIGTERM, it is reparented to init and never gets killed. Using the namespaces/pid isolator only masks the problem, because the hanging process is killed before it can shut down gracefully.
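The failure mode described above can be reproduced outside Mesos. The following is a minimal sketch in Python, not the executor's code: it assumes a POSIX system with sh on the PATH, and the names CHILD and sigterm_wrapped_command are hypothetical. It signals both the sh wrapper and a child that ignores SIGTERM, then reports the wrapper's exit status and whether the child survived.

```python
import os
import signal
import subprocess
import sys

# Hypothetical stand-in for a wrapped command that ignores SIGTERM.
CHILD = """
import os, signal, time
signal.signal(signal.SIGTERM, signal.SIG_IGN)
print(os.getpid(), flush=True)
time.sleep(30)
"""


def sigterm_wrapped_command():
    """Return (wrapper_status, child_alive) after SIGTERM hits the whole tree."""
    # sh starts the child in the background and waits for it, so sh stays
    # in the tree as the root, much like the `sh -c` wrapper in the report.
    wrapper = subprocess.Popen(
        ["sh", "-c", '"$0" -c "$1" & wait', sys.executable, CHILD],
        stdout=subprocess.PIPE,
        text=True,
    )
    # The child prints its pid only after it has started ignoring SIGTERM.
    child_pid = int(wrapper.stdout.readline())

    # Signal every process in the tree, root included.
    for pid in (wrapper.pid, child_pid):
        os.kill(pid, signal.SIGTERM)

    # Only the root is waited for; it dies right away.
    wrapper_status = wrapper.wait(timeout=10)

    try:
        os.kill(child_pid, 0)  # signal 0 only checks that the pid exists
        child_alive = True
    except ProcessLookupError:
        child_alive = False

    # Clean up the leaked child so the sketch leaves nothing behind.
    if child_alive:
        os.kill(child_pid, signal.SIGKILL)
    wrapper.stdout.close()
    return wrapper_status, child_alive
```

In this sketch the first element of the result is the negative SIGTERM number (the wrapper was terminated by the signal) and the second is True, because the child is still running after the root has been reported as terminated.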

      The fix is to send SIGTERM only to the children of sh. sh will exit once all of its child processes finish; if they do not, they will be killed by the escalation to SIGKILL.

      All versions from 0.20 are affected.

      This test should pass: src/tests/command_executor_tests.cpp:342
      Mailing list thread

        Issue Links

          Activity

          klueska Kevin Klues added a comment -

          I assume you are referring to the "Command Executor", not the "Default Executor" (the default executor is new as of the 1.1 release and deals with launching task groups).

          Jie Yu, Vinod Kone, Benjamin Mahler: who is the best person to take a look at this bug?

          haosdent@gmail.com haosdent added a comment -

          Kevin Klues, Tomasz Janiszewski: this is an sh problem rather than a Mesos bug, because /bin/sh doesn't forward signals to its child processes.

          Docker has a similar problem with graceful exit when sh is used to launch commands; refer to https://www.ctl.io/developers/blog/post/gracefully-stopping-docker-containers/ for details.

          So the correct way to implement graceful exit in Docker, Mesos and other applications is to avoid using sh. More precisely, users should set CommandInfo.shell to false and use the exec form to launch tasks if they want their tasks to exit gracefully. Does that make sense?
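The exec-form suggestion can be illustrated with a small Python sketch. It uses Python's subprocess module as a stand-in for launching a task without a shell wrapper; it is not Mesos code, and CHILD and run_exec_form are hypothetical names. Because no sh sits between the launcher and the command, the signal reaches the command directly and the launcher observes the command's own exit status.

```python
import signal
import subprocess
import sys

# Hypothetical task that cleans up on SIGTERM and exits with status 42.
CHILD = """
import signal, sys, time
signal.signal(signal.SIGTERM, lambda signum, frame: sys.exit(42))
print("ready", flush=True)
time.sleep(30)
"""


def run_exec_form():
    """Launch the task as an argv list (no shell) and stop it gracefully."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        text=True,
    )
    # Wait until the task reports that its SIGTERM handler is installed.
    proc.stdout.readline()
    proc.send_signal(signal.SIGTERM)
    # The pid being waited for is the task itself, so this is its own status.
    status = proc.wait(timeout=10)
    proc.stdout.close()
    return status
```

Here run_exec_form() returns the exit status chosen by the task's own handler (42 in this sketch) rather than a "terminated by signal" status from a wrapper shell.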

          janisz Tomasz Janiszewski added a comment -

          I don't think the fact that /bin/sh doesn't forward signals to child processes is the problem here: killTree delivers the signal to every child of sh. The problem is that sh terminates quickly, while its children may need some time to shut down gracefully.

          janisz Tomasz Janiszewski added a comment -

          I think it's related to the following comment by Benjamin Hindman:

          // TODO(benh): Allow excluding the root pid from stopping, killing,
          // and continuing so as to provide a means for expressing "kill all of
          // my children". This is non-trivial because of the current
          // implementation.
          

          https://github.com/apache/mesos/blob/22d3f56ce10cf61b6a1f06614bd63e0943a8b769/3rdparty/stout/include/stout/os/posix/killtree.hpp#L54-L57
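The "kill all of my children" idea in the TODO amounts to selecting every process below the root while leaving the root itself out. A minimal sketch of that selection in Python follows; it is not the stout implementation, the function name descendants is hypothetical, and the pid-to-parent mapping is assumed to have been collected elsewhere (for example from the process table).

```python
def descendants(parent_of, root):
    """Return the sorted pids of all processes below root, excluding root.

    parent_of maps each pid to its parent pid.
    """
    # Invert the mapping: parent pid -> list of direct children.
    children = {}
    for pid, ppid in parent_of.items():
        children.setdefault(ppid, []).append(pid)

    # Walk the tree from the root, collecting everything underneath it.
    seen = set()
    stack = [root]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child != root and child not in seen:
                seen.add(child)
                stack.append(child)
    return sorted(seen)


# Example process table: an sh wrapper (18776), the script it runs (18790),
# a sleep started by the script (18832), and an unrelated process (999).
example_table = {18776: 1, 18790: 18776, 18832: 18790, 999: 1}
targets = descendants(example_table, 18776)
```

With this example table, targets contains 18790 and 18832 but neither the root 18776 nor the unrelated pid 999, so a signal sent to targets would reach only the children of the wrapper.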

          alexr Alexander Rukletsov added a comment -

          Tomasz Janiszewski, this is, unfortunately, a known issue that has been here for a while (linked the original ticket). Surprisingly, we haven't seen a lot of requests to fix it (do folks avoid wrapping their tasks in sh?) and never got to work on it.

          Do you want to suggest a patch? I'll be happy to shepherd.

          xds2000 Deshi Xiao added a comment -

          How can I reproduce this bug? I'd like to understand where a patch could be made.

          janisz Tomasz Janiszewski added a comment -

          This error is quite easy to reproduce.

          1. Run a Mesos cluster with the default configuration (you can use ./build/bin/mesos-local.sh). Do not enable any isolators, especially the namespaces/pid isolator, because it can mask this bug.
          2. Create a script that runs in an infinite loop and ignores signals:

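          cat > /tmp/script.sh <<'EOF'
          #!/bin/sh
          # Print a message instead of exiting on HUP, INT and TERM.
          trap "echo SIGNAL" HUP INT TERM
          while : ; do
            date >> /tmp/date.txt
            sleep 1
          done
          EOF
          # Make the script executable so it can be launched directly.
          chmod +x /tmp/script.sh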
          

          3. Start the script on Mesos and kill it after it has been running for a couple of seconds. You can use any framework, e.g. mesos-execute --kill_after=10secs --master=localhost:5050 --command="/tmp/script.sh" --name="graceful-kill-test"
          4. Monitor the logs. You will see that the script is signaled with SIGTERM and the shell has exited, but the script is still running and producing output.

          The easiest solution would be to signal the tree and then wait for all processes in the tree to exit, not only the root.
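The proposal of signaling everything and then waiting for every process, not only the root, can be sketched as follows. This is an illustration under stated assumptions rather than a patch for the executor: it is written in Python, it tracks processes through subprocess.Popen handles instead of raw pids, and the names terminate_all, STUBBORN and demo are hypothetical.

```python
import signal
import subprocess
import sys
import time

# Hypothetical command that ignores SIGTERM and never shuts down by itself.
STUBBORN = """
import signal, time
signal.signal(signal.SIGTERM, signal.SIG_IGN)
print("ready", flush=True)
time.sleep(30)
"""


def terminate_all(procs, grace_secs, poll_interval=0.05):
    """SIGTERM every process, wait for ALL of them, then SIGKILL stragglers.

    Returns True if every process exited within the grace period.
    """
    for proc in procs:
        if proc.poll() is None:
            proc.send_signal(signal.SIGTERM)

    # The grace period ends only when every process is gone, not just one.
    deadline = time.monotonic() + grace_secs
    while time.monotonic() < deadline:
        if all(proc.poll() is not None for proc in procs):
            return True
        time.sleep(poll_interval)

    # Escalation: anything still alive after the grace period is killed.
    survivors = [proc for proc in procs if proc.poll() is None]
    for proc in survivors:
        proc.kill()
        proc.wait()
    return not survivors


def demo():
    """Run one process that honors SIGTERM and one that ignores it."""
    polite = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
    stubborn = subprocess.Popen(
        [sys.executable, "-c", STUBBORN], stdout=subprocess.PIPE, text=True
    )
    stubborn.stdout.readline()  # wait until SIGTERM is being ignored
    graceful = terminate_all([polite, stubborn], grace_secs=0.5)
    stubborn.stdout.close()
    return graceful, polite.returncode, stubborn.returncode
```

In demo(), the process that honors SIGTERM exits during the grace period, while the one that ignores it is still alive at the deadline and is therefore killed with SIGKILL, so the call reports that the shutdown was not graceful.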

          xds2000 Deshi Xiao added a comment -

          Tomasz Janiszewski, I have followed the reproduction steps, but I am not sure how to check whether "the shell has exited but the script is still running and producing output" actually happened. Could you please point out what I should look for in the Mesos log so that I can understand? Sorry for the request.

          janisz Tomasz Janiszewski added a comment -

          The log I mentioned is /tmp/date.txt. You should be able to see new entries after the task is killed.

          xds2000 Deshi Xiao added a comment -

          Tomasz Janiszewski, please check the screenshot.

          janisz Tomasz Janiszewski added a comment -

          Deshi Xiao, on my setup it works differently. I'm on Mesos 1.3, and when I follow the steps described above I end up in a state where the task is reported as killed but is still running. In the logs you provided, I see that the Mesos executor somehow determined that not all processes had exited and sent SIGKILL to them. In my case it ends at SIGTERM:

          Sent SIGTERM to the following process trees:
          [
          -+- 18776 sh -c /tmp/script.sh
           \-+- 18790 /bin/sh /tmp/script.sh
             \--- 18832 sleep 1
          ]
          Scheduling escalation to SIGKILL in 3secs from now
          Terminated
          SIGNAL
          Command terminated with signal Terminated (pid: 18776)
          
          xds2000 Deshi Xiao added a comment -

          Tomasz Janiszewski, yes, I built from the upstream Mesos code base; it is 1.4.

          janisz Tomasz Janiszewski added a comment -

          I can reproduce it on latest master: https://github.com/apache/mesos/commit/400d3002d4aa82cbae4b55bced608e95225176e4
          xds2000 Deshi Xiao added a comment -

          Tomasz Janiszewski, can you write a test to cover it? I have no clue where in the code to start fixing this.

          xds2000 Deshi Xiao added a comment -

          Alexander Rukletsov, hi, do you have cycles to shepherd me? I would like to try fixing this.


            People

            • Assignee: Unassigned
            • Reporter: janisz Tomasz Janiszewski
            • Votes: 1
            • Watchers: 9
