Details
- Bug
- Status: Resolved
- Blocker
- Resolution: Fixed
- 1.4.0, 1.4.1, 1.4.2, 1.5.0, 1.5.1, 1.6.0, 1.6.1
- None
Description
After successfully launching a nested container via `LAUNCH_NESTED_CONTAINER_SESSION`, the checker library calls waitNestedContainer for that container. Before launching a nested container for a subsequent check, the library calls `REMOVE_NESTED_CONTAINER` to remove the previous nested container. The `REMOVE_NESTED_CONTAINER` call therefore follows `WAIT_NESTED_CONTAINER`, which ensures that the nested container has terminated and can be removed and cleaned up.
If the launch fails, the library does not call `WAIT_NESTED_CONTAINER`. Despite the failure, the container might still have been launched, and the subsequent attempt to remove it without first calling `WAIT_NESTED_CONTAINER` leads to errors like:
W0202 20:03:08.895830 7 checker_process.cpp:503] Received '500 Internal Server Error' (Nested container has not terminated yet) while removing the nested container '2b0c542c-1f5f-42f7-b914-2c1cadb4aeca.da0a7cca-516c-4ec9-b215-b34412b670fa.check-49adc5f1-37a3-4f26-8708-e27d2d6cd125' used for the COMMAND check for task 'node-0-server__e26a82b0-fbab-46a0-a1ea-e7ac6cfa4c91
The checker library should always call `WAIT_NESTED_CONTAINER` before `REMOVE_NESTED_CONTAINER`.
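The ordering requirement can be illustrated with a minimal sketch. This is not the Mesos checker code: `FakeAgent`, `run_check`, and the simulated failure mode (a launch that starts the container but reports an error to the caller) are hypothetical stand-ins, and the fake `wait_nested_container` simply marks the container as terminated instead of blocking until it exits.

```python
class FakeAgent:
    """Hypothetical in-memory stand-in for the agent API (not Mesos code)."""

    def __init__(self, fail_launch_response=False):
        self.fail_launch_response = fail_launch_response
        self.known = set()    # containers the agent still tracks
        self.running = set()  # containers that have not terminated

    def launch_nested_container_session(self, container_id):
        # The container may start even though the caller observes a failure.
        self.known.add(container_id)
        self.running.add(container_id)
        if self.fail_launch_response:
            raise ConnectionError("launch failed from the caller's point of view")

    def wait_nested_container(self, container_id):
        # Simplification: returning means the container has terminated.
        self.running.discard(container_id)

    def remove_nested_container(self, container_id):
        if container_id in self.running:
            raise RuntimeError("Nested container has not terminated yet")
        self.known.discard(container_id)


def run_check(agent, container_id, previous_id=None):
    """Run one check, cleaning up the previous check's container first."""
    if previous_id is not None:
        # Always wait before removing, even if the previous launch
        # was reported as failed.
        agent.wait_nested_container(previous_id)
        agent.remove_nested_container(previous_id)
    try:
        agent.launch_nested_container_session(container_id)
    except ConnectionError:
        return False  # the container might be running regardless
    agent.wait_nested_container(container_id)
    return True
```

In this sketch, a check whose launch is reported as failed leaves its container running; the next call to `run_check` still waits for it before removing it, so the removal does not hit the "has not terminated yet" error.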
Issue Links
- is related to MESOS-9131: Health checks launching nested containers while a container is being destroyed lead to unkillable tasks. (Resolved)