Description
Steps to reproduce:
1) Run mesos-execute to launch a task that joins the CNI network net1 with checkpointing enabled:
$ cat task_cni.json
{
  "name": "test1",
  "task_id": {"value" : "test1"},
  "agent_id": {"value" : ""},
  "resources": [
    {"name": "cpus", "type": "SCALAR", "scalar": {"value": 0.1}},
    {"name": "mem", "type": "SCALAR", "scalar": {"value": 32}}
  ],
  "command": {
    "value": "sleep 1000"
  },
  "container": {
    "type": "MESOS",
    "network_infos": [
      {
        "name": "net1"
      }
    ]
  }
}

$ mesos-execute --master=192.168.56.5:5050 --task=file:///home/stack/workspace/config/task_cni.json --checkpoint
2) Once the task reaches the TASK_RUNNING state, restart the agent process. The agent log then shows that the container is destroyed:
...
I0622 17:30:00.792310 7426 containerizer.cpp:1024] Recovering isolators
I0622 17:30:00.798740 7430 cni.cpp:437] Removing unknown orphaned container faf69105-e76f-49c7-8e56-964c2f882cff
...
I0622 17:30:01.025600 7433 cni.cpp:1546] Unmounted the network namespace handle '/run/mesos/isolators/network/cni/faf69105-e76f-49c7-8e56-964c2f882cff/ns' for container faf69105-e76f-49c7-8e56-964c2f882cff
I0622 17:30:01.026211 7433 cni.cpp:1557] Removed the container directory '/run/mesos/isolators/network/cni/faf69105-e76f-49c7-8e56-964c2f882cff'
I0622 17:30:02.935093 7429 slave.cpp:5215] Cleaning up un-reregistered executors
I0622 17:30:02.935221 7429 slave.cpp:5233] Killing un-reregistered executor 'test1' of framework dc2b3db0-953c-47a4-8fd4-f6d040e9d10e-0002 at executor(1)@192.168.11.7:33719
I0622 17:30:02.935900 7429 slave.cpp:7311] Finished recovery
I0622 17:30:02.937409 7427 containerizer.cpp:2405] Destroying container faf69105-e76f-49c7-8e56-964c2f882cff in RUNNING state
mesos-execute then receives a TASK_GONE status update for the task:
$ mesos-execute --master=192.168.56.5:5050 --task=file:///home/stack/workspace/config/task_cni.json --checkpoint
I0622 17:29:50.538630 7246 scheduler.cpp:189] Version: 1.7.0
I0622 17:29:50.548589 7261 scheduler.cpp:355] Using default 'basic' HTTP authenticatee
I0622 17:29:50.550348 7263 scheduler.cpp:538] New master detected at master@192.168.56.5:5050
Subscribed with ID dc2b3db0-953c-47a4-8fd4-f6d040e9d10e-0002
Submitted task 'test1' to agent 'dc2b3db0-953c-47a4-8fd4-f6d040e9d10e-S0'
Received status update TASK_STARTING for task 'test1'
  source: SOURCE_EXECUTOR
Received status update TASK_RUNNING for task 'test1'
  source: SOURCE_EXECUTOR
Received status update TASK_GONE for task 'test1'
  message: 'Executor did not reregister within 2secs'
  source: SOURCE_AGENT
  reason: REASON_EXECUTOR_REREGISTRATION_TIMEOUT
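To check whether an agent log shows the same symptom, one option is to look for a container ID that the CNI isolator removed as an "unknown orphaned container" and that the containerizer later destroyed while in the RUNNING state. The following is only a rough sketch: the function name find_wrongly_orphaned and the SAMPLE_LOG constant are hypothetical, and the regular expressions assume the message wording shown in the log excerpt above, which may differ in other Mesos versions.

```python
import re

# Assumed message formats, copied from the agent log excerpt in this report.
ORPHAN_RE = re.compile(
    r"cni\.cpp:\d+\] Removing unknown orphaned container (\S+)")
DESTROY_RE = re.compile(
    r"containerizer\.cpp:\d+\] Destroying container (\S+) in (\w+) state")


def find_wrongly_orphaned(log_text):
    """Return container IDs that the CNI isolator treated as orphans and
    that were later destroyed while still in the RUNNING state."""
    orphaned = set(ORPHAN_RE.findall(log_text))
    hits = []
    for container_id, state in DESTROY_RE.findall(log_text):
        if container_id in orphaned and state == "RUNNING":
            hits.append(container_id)
    return hits


# Three lines taken from the agent log above (hypothetical constant name).
SAMPLE_LOG = """\
I0622 17:30:00.792310 7426 containerizer.cpp:1024] Recovering isolators
I0622 17:30:00.798740 7430 cni.cpp:437] Removing unknown orphaned container faf69105-e76f-49c7-8e56-964c2f882cff
I0622 17:30:02.937409 7427 containerizer.cpp:2405] Destroying container faf69105-e76f-49c7-8e56-964c2f882cff in RUNNING state
"""

if __name__ == "__main__":
    for container_id in find_wrongly_orphaned(SAMPLE_LOG):
        print(container_id)
```

In practice the log text would be read from the agent's log file instead of SAMPLE_LOG; the location of that file depends on how the agent was started.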