Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- None
- Mesosphere Sprint 73, Mesosphere Sprint 74, Mesosphere Sprint 75, Mesosphere Sprint 76, Mesosphere Sprint 77
- 8
Description
Observed this on internal Mesosphere CI.
../../src/tests/cluster.cpp:662: Failure
Value of: containers->empty()
Actual: false
Expected: true
Failed to destroy containers: { test }
Steps to reproduce
- Add `::sleep(1);` before the "test" cgroup is removed
- Recompile
- Run `sudo GLOG_v=2 ./src/mesos-tests --gtest_filter=LinuxCapabilitiesIsolatorFlagsTest.ROOT_IsolatorFlags --gtest_break_on_failure --gtest_repeat=10 --verbose`
Race description
While recovery is still in progress for the first slave, calling `StartSlave()` for the second slave calls `slave::Containerizer::create()` to create a containerizer. Creating a Mesos containerizer in turn calls `cgroups::prepare`, which eventually creates a "test" container. The recovery process that is still running might detect this "test" container as an orphaned container.
Thus, there is a race between the recovery process for the first slave and the attempt to create a containerizer for the second slave.
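The interleaving described above can be replayed deterministically with a small model. This is only a sketch, not Mesos code: the hierarchy is modelled as an in-memory set of names, and `Hierarchy`, `beginPrepare`, `finishPrepare`, `findOrphans`, the two `replay...` functions and the `container-1` entry are all hypothetical names introduced for illustration. It assumes that recovery treats every entry it does not know about as an orphan.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical in-memory model of the hierarchy shared by both slaves:
// just the set of entry names that currently exist.
using Hierarchy = std::set<std::string>;

// Models the first half of the prepare step: the temporary "test"
// entry appears in the shared hierarchy.
void beginPrepare(Hierarchy& hierarchy)
{
  hierarchy.insert("test");
}

// Models the second half: the temporary "test" entry is removed again.
// The `::sleep(1)` from the reproduction steps widens the window between
// beginPrepare() and finishPrepare().
void finishPrepare(Hierarchy& hierarchy)
{
  assert(hierarchy.count("test") == 1);
  hierarchy.erase("test");
}

// Models recovery (assumption): every entry that is not in `known`
// is reported as an orphan.
std::set<std::string> findOrphans(
    const Hierarchy& hierarchy,
    const std::set<std::string>& known)
{
  std::set<std::string> orphans;
  for (const std::string& name : hierarchy) {
    if (known.count(name) == 0) {
      orphans.insert(name);
    }
  }
  return orphans;
}

// Bad interleaving: recovery scans while "test" still exists.
std::set<std::string> replayBadInterleaving()
{
  Hierarchy hierarchy = {"container-1"};
  const std::set<std::string> known = {"container-1"};

  beginPrepare(hierarchy);                                        // second slave
  std::set<std::string> orphans = findOrphans(hierarchy, known);  // first slave
  finishPrepare(hierarchy);                                       // second slave
  return orphans;
}

// Good interleaving: prepare finishes before recovery scans.
std::set<std::string> replayGoodInterleaving()
{
  Hierarchy hierarchy = {"container-1"};
  const std::set<std::string> known = {"container-1"};

  beginPrepare(hierarchy);
  finishPrepare(hierarchy);
  return findOrphans(hierarchy, known);
}
```

In this model `replayBadInterleaving()` reports "test" as an orphan, while `replayGoodInterleaving()` reports nothing, which matches the intermittent nature of the failure.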
Attachments
Issue Links
- is part of MESOS-7506 Multiple tests leave orphan containers. (Resolved)