Description
Here's how to reproduce this issue:
- Start a task using the Docker containerizer (the same will probably happen with the command executor).
- Stop the corresponding Mesos agent while the task is running.
- Overwrite the executor's checkpointed forked pid, which is stored in the agent's meta directory, e.g., /var/lib/mesos/slave/meta/slaves/latest/frameworks/19faf6e0-3917-48ab-8b8e-97ec4f9ed41e-0001/executors/foo.13faee90-b5f0-11e7-8032-e607d2b4348c/runs/latest/pids/forked.pid. I used pid 2, which normally belongs to kthreadd.
- Reboot the host.
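
The pid-overwrite step can be scripted. The following is a minimal sketch, not part of the original report: the helper name overwrite_forked_pids and the glob pattern are hypothetical, derived only from the example path above, and it assumes the forked.pid file holds the pid as plain decimal text.

```python
from pathlib import Path

# Location of an executor's checkpointed forked pid relative to the agent's
# meta directory, following the example path quoted in the steps above.
FORKED_PID_GLOB = (
    "slaves/latest/frameworks/*/executors/*/runs/latest/pids/forked.pid"
)


def overwrite_forked_pids(meta_dir, new_pid):
    """Replace every checkpointed forked pid under meta_dir with new_pid.

    Returns a list of (path, old_contents) tuples so the change can be undone.
    """
    changed = []
    for pid_file in sorted(Path(meta_dir).glob(FORKED_PID_GLOB)):
        old_contents = pid_file.read_text()
        # Assumption: the checkpoint stores the pid as plain decimal text.
        pid_file.write_text(str(new_pid))
        changed.append((pid_file, old_contents))
    return changed
```

With the agent stopped, calling overwrite_forked_pids("/var/lib/mesos/slave/meta", 2) would mirror the manual edit described above; the returned old contents can be written back to restore the checkpoint.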
Issue Links
- is related to: MESOS-6223 Allow agents to re-register post a host reboot (Resolved)
- relates to: MESOS-9501 Mesos executor fails to terminate and gets stuck after agent host reboot. (Resolved)
- relates to: MESOS-9672 Docker containerizer should ignore pids of executors that do not pass the connection check. (Resolved)