Description
In the fix for MESOS-8488, the Docker executor reaps the Docker container process directly and then waits at most 3 seconds for `docker run` to return. In some cases, however, `docker run` needs more than 3 seconds to return. For example, when the Docker container uses an external rexray volume (see the attached task JSON), there are about 5 seconds between the container process exiting and `docker run` returning (I suspect the Docker daemon is doing some work related to the rexray volume during this time). As a result, the wait times out, we reap the container, and we send a TASK_FAILED.
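To illustrate the timing problem, here is a minimal Python sketch of "wait a bounded time for a command to return, otherwise give up on it". It is not the Docker executor's code: `wait_with_grace` and `fake_docker_run` are hypothetical names, the child process is only a stand-in for a `docker run` that lingers after the container process has exited, and the returned strings are illustrative labels rather than Mesos task states.

```python
import subprocess
import sys


def fake_docker_run(linger_seconds):
    """Build a command that stands in for a `docker run` which takes
    `linger_seconds` to return (hypothetical helper, not a Docker API)."""
    return [sys.executable, "-c", f"import time; time.sleep({linger_seconds})"]


def wait_with_grace(cmd, grace_seconds):
    """Start `cmd` and wait at most `grace_seconds` for it to return.

    Returns "returned" if the command finished within the grace period,
    or "timed_out" if it did not and had to be killed.
    """
    proc = subprocess.Popen(cmd)
    try:
        proc.wait(timeout=grace_seconds)
        return "returned"
    except subprocess.TimeoutExpired:
        # The command outlived the grace period: kill it and collect it.
        proc.kill()
        proc.wait()
        return "timed_out"
```

With the numbers from this ticket, `wait_with_grace(fake_docker_run(5), 3)` takes the timeout branch, which corresponds to the case where the task ends up as TASK_FAILED, while a grace period longer than the roughly 5 second delay would let the command return normally.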
Attachments
Issue Links
- is caused by
  - MESOS-8488 Docker bug can cause unkillable tasks. (Resolved)