Details
Type: Bug
Status: Resolved
Priority: Critical
Resolution: Invalid
0.23.0
None
None
Twitter Mesos Q2 Sprint 6
3
Description
Network isolator has multiple instances of the following pattern:
    Try<bool> something = ....::create();
    if (something.isError()) {
      ++metrics.something_errors;
      return Failure("Failed to create something ...");
    } else if (!something.get()) {
      ++metrics.adding_veth_icmp_filters_already_exist;
      return Failure("Something already exists");
    }
These failures have occurred in production when the isolator failed to recover or delete an orphan, leaving the slave online but unable to create new resources. We should convert the second failure in this pattern (the "already exists" branch) to an informational message, since the final state of the system is the state that we requested.
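A minimal sketch of the proposed change, for illustration only. It does not use the real Mesos or stout headers: `TryBool` is a hypothetical stand-in for `Try<bool>`, `Result` stands in for returning a `Failure` or continuing, and `handleCreate` and the metric names are placeholders mirroring the pattern above. The only behavioral difference from the current pattern is that the "already exists" branch logs and continues instead of failing.

```cpp
#include <iostream>
#include <string>

// Hypothetical stand-in for stout's Try<bool>: holds either an error
// message or a boolean value (false meaning "already existed").
class TryBool {
public:
  static TryBool some(bool v) { return TryBool(false, "", v); }
  static TryBool error(const std::string& m) { return TryBool(true, m, false); }

  bool isError() const { return failed_; }
  bool get() const { return value_; }
  const std::string& message() const { return message_; }

private:
  TryBool(bool failed, const std::string& message, bool value)
    : failed_(failed), message_(message), value_(value) {}

  bool failed_;
  std::string message_;
  bool value_;
};

// Placeholder counters mirroring the metrics in the pattern above.
struct Metrics {
  int something_errors = 0;
  int something_already_exist = 0;
};

// Hypothetical result: ok == false corresponds to `return Failure(...)`.
struct Result {
  bool ok;
  std::string message;
};

Result handleCreate(const TryBool& something, Metrics& metrics) {
  if (something.isError()) {
    // A genuine error is still a failure.
    ++metrics.something_errors;
    return {false, "Failed to create something: " + something.message()};
  } else if (!something.get()) {
    // The object already exists, which is the state we asked for:
    // count it and log it, but do not fail the operation.
    ++metrics.something_already_exist;
    std::clog << "INFO: something already exists; continuing" << std::endl;
  }
  return {true, ""};
}
```

In the isolator itself the log line would presumably go through the existing logging facility (for example glog's `LOG(INFO)`) rather than `std::clog`. The metric is still incremented so that leftover state from orphans remains visible to operators.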
Issue Links
is related to
- MESOS-2367 Improve slave resiliency in the face of orphan containers (Resolved)
- MESOS-2914 Port mapping isolator should cleanup unknown orphan containers after all known orphan containers are recovered during recovery. (Resolved)