Details
- Type: Bug
- Status: Resolved
- Priority: Critical
- Resolution: Fixed
- Sprint: Mesosphere Sprint 2018-26, Mesosphere Sprint 2018-27, Mesosphere Sprint 2018-28
- Story Points: 8
Description
When launching containers at scale on a single agent, a container can occasionally get stuck in ISOLATING. Such a container becomes un-destroyable, because the containerizer's destroy always waits for isolate() to finish before continuing.
We added more logging to debug this issue:
Aug 10 17:23:28 ip-10-0-1-129.us-west-2.compute.internal mesos-agent[2974]: I0810 17:23:28.050068 2995 collect.hpp:271] $$$$: AwaitProcess waited invoked for ProcessBase ID: __await__(26651); futures size: 3; future: Ready; future index: 2; ready count: 1
Aug 10 17:23:28 ip-10-0-1-129.us-west-2.compute.internal mesos-agent[2974]: I0810 17:23:28.414436 2998 collect.hpp:271] $$$$: AwaitProcess waited invoked for ProcessBase ID: __await__(26651); futures size: 3; future: Ready; future index: 0; ready count: 2
which shows that the await() in CNI::attach() is stuck on the second future (the io::read() of the plugin's stdout); a simplified sketch of this pattern follows.
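For context, the await in question looks roughly like this (a simplified sketch of the pattern in NetworkCniIsolatorProcess::attach(), with the plugin invocation and result handling omitted; the helper name awaitPlugin is ours and the exact types are approximate):

```cpp
#include <string>
#include <tuple>

#include <process/collect.hpp>
#include <process/future.hpp>
#include <process/io.hpp>
#include <process/subprocess.hpp>

#include <stout/option.hpp>

using process::Future;
using process::Subprocess;

// The result is only ready once the plugin has exited AND both of its
// output pipes have reached EOF.
Future<std::tuple<
    Future<Option<int>>,      // future 0: plugin exit status
    Future<std::string>,      // future 1: plugin stdout (the one stuck here)
    Future<std::string>>>     // future 2: plugin stderr
awaitPlugin(const Subprocess& s)
{
  return process::await(
      s.status(),
      process::io::read(s.out().get()),
      process::io::read(s.err().get()));
}
```

Because the containerizer's destroy waits on this same await(), a stdout read that never reaches EOF wedges the container in ISOLATING.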
Looking at the fd of this stdout:
Aug 10 17:23:27 ip-10-0-1-129.us-west-2.compute.internal mesos-agent[2974]: I0810 17:23:27.657501 2995 cni.cpp:1287] !!!!: Start to await for plugin '/opt/mesosphere/active/mesos/libexec/mesos/mesos-cni-port-mapper' to finish for container 1c8abf4c-f71a-4704-9a73-1ab0dd709c62 with pid '16644'; stdout fd: 1781; stderr fd: 1800
We found:
core@ip-10-0-1-129 ~ $ ps aux | grep mesos-agent
core 1674 0.0 0.0 6704 864 pts/0 S+ 20:00 0:00 grep --colour=auto mesos-agent
root 2974 16.4 2.5 1211096 414048 ? Ssl 17:02 29:11 /opt/mesosphere/packages/mesos--61265af3be37861f26b657c1f9800293b86a0374/bin/mesos-agent
core@ip-10-0-1-129 ~ $ sudo ls -al /proc/2974/fd/ | grep 1781
l-wx------. 1 root root 64 Aug 10 19:38 1781 -> /var/lib/mesos/slave/meta/slaves/d3089315-8e34-40b4-b1f7-0ac6a624d7db-S0/frameworks/d3089315-8e34-40b4-b1f7-0ac6a624d7db-0000/executors/test2.d820326d-9cc1-11e8-9809-ee15da5c8980/runs/38e9270d-ebda-4758-ad96-40c5b84bffdc/tasks/test2.d820326d-9cc1-11e8-9809-ee15da5c8980/task.updates
core@ip-10-0-1-129 ~ $ ps aux | grep 27981
core 2201 0.0 0.0 6704 884 pts/0 S+ 20:06 0:00 grep --colour=auto 27981
root 27981 0.0 0.0 1516 4 ? Ss 17:25 0:00 sleep 10000
core@ip-10-0-1-129 ~ $ cat /proc/s^C
core@ip-10-0-1-129 ~ $ sudo -s
ip-10-0-1-129 core # ls -al /proc/27981/fd | grep 275230
lr-x------. 1 root root 64 Aug 10 20:05 1781 -> pipe:[275230]
l-wx------. 1 root root 64 Aug 10 20:05 1787 -> pipe:[275230]
core@ip-10-0-1-129 ~ $ sudo ls -al /proc/2974/fd/ | grep pipe
lr-x------. 1 root root 64 Aug 10 17:02 11 -> pipe:[49380]
l-wx------. 1 root root 64 Aug 10 17:02 14 -> pipe:[49380]
lr-x------. 1 root root 64 Aug 10 17:02 17 -> pipe:[48909]
lr-x------. 1 root root 64 Aug 10 19:38 1708 -> pipe:[275089]
l-wx------. 1 root root 64 Aug 10 19:38 1755 -> pipe:[275089]
lr-x------. 1 root root 64 Aug 10 19:38 1787 -> pipe:[275230]
l-wx------. 1 root root 64 Aug 10 17:02 19 -> pipe:[48909]
Pipe 275230 is held by the agent process and the sleep process at the same time! Because the sleep process keeps a leaked copy of the pipe's write end open, the agent's io::read() on the read end never sees EOF, so the stdout future never becomes ready.
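A minimal standalone illustration (plain POSIX, not Mesos code) of why that leaked write end matters: read() on a pipe's read end only returns 0 (EOF) once every copy of the write end, in every process, has been closed.

```cpp
#include <sys/wait.h>
#include <unistd.h>

#include <cstdio>

int main()
{
  int fds[2];
  if (pipe(fds) != 0) { perror("pipe"); return 1; }

  pid_t child = fork();
  if (child == 0) {
    // Child: inherits both ends, like the leaked `sleep 10000` process.
    close(fds[0]);
    sleep(5);       // Keeps a copy of the write end open for a while.
    close(fds[1]);  // Only now can the parent's read() return EOF.
    _exit(0);
  }

  // Parent: plays the role of the agent reading the plugin's stdout.
  close(fds[1]);    // Close our own copy of the write end.

  char buf[128];
  ssize_t n;
  while ((n = read(fds[0], buf, sizeof(buf))) > 0) {
    fwrite(buf, 1, n, stdout);
  }

  // We only get here ~5 seconds later, once the child closes its copy of
  // the write end. With `sleep 10000` holding the fd, the agent's
  // io::read() would stay pending for roughly 3 hours.
  printf("EOF after the child closed its leaked write end\n");

  waitpid(child, nullptr, 0);
  return 0;
}
```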
The leak is possible because we don't use `pipe2` to create the pipe with `O_CLOEXEC` in subprocess:
https://github.com/apache/mesos/blob/1.5.x/3rdparty/libprocess/src/subprocess_posix.cpp#L61
Although we do set cloexec on those fds later:
https://github.com/apache/mesos/blob/1.5.x/3rdparty/libprocess/src/subprocess.cpp#L366-L373
There is a race where another thread forks after the `pipe()` call but before cloexec is set, so the child inherits the pipe fds across its exec. This is more likely on a busy system, which explains why it is not hard to reproduce the issue when launching a lot of containers on a single box. A sketch of the race-free alternative follows.
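For illustration, here is a minimal sketch of the race-free approach (an assumed shape, not the actual patch; createCloexecPipe is a hypothetical helper): pipe2() applies O_CLOEXEC atomically at creation time, so there is no window in which a concurrent fork() in another thread can hand unprotected descriptors to an exec'd child.

```cpp
#include <fcntl.h>
#include <unistd.h>

#include <array>
#include <cerrno>
#include <system_error>

// Hypothetical helper: returns {read fd, write fd}, both with FD_CLOEXEC set.
std::array<int, 2> createCloexecPipe()
{
  int fds[2];

#ifdef __linux__
  // Atomic: the fds are created with O_CLOEXEC already set, so a fork()
  // in another thread can never expose them to an exec'd child.
  if (::pipe2(fds, O_CLOEXEC) != 0) {
    throw std::system_error(errno, std::generic_category(), "pipe2");
  }
#else
  // Racy fallback (effectively what the buggy code path did): pipe() then
  // fcntl(FD_CLOEXEC). A fork() between the two calls leaks the fds.
  if (::pipe(fds) != 0) {
    throw std::system_error(errno, std::generic_category(), "pipe");
  }
  for (int fd : fds) {
    ::fcntl(fd, F_SETFD, FD_CLOEXEC);
  }
#endif

  return {fds[0], fds[1]};
}
```

MESOS-8128 (linked below) takes this direction by making os::pipe file descriptors O_CLOEXEC, and MESOS-9152 additionally closes all non-whitelisted fds in the posix subprocess child.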
Issue Links
- depends upon
  - MESOS-8128 Make os::pipe file descriptors O_CLOEXEC. (Resolved)
  - MESOS-9152 Close all file descriptors except whitelist_fds in posix/subprocess. (Resolved)
- is duplicated by
  - MESOS-9162 Unkillable pod container stuck in ISOLATING (Resolved)