Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
None
-
None
-
None
-
Q3 Sprint 2
-
3
Description
A slave may crash while it is installing or removing filters. Slave recovery for the network isolator should tolerate the resulting partially installed filters. We also want to avoid leaking a filter on host eth0 and host lo.
The current code does not tolerate this, so recovery may fail with the following error:
Failed to perform recovery: Collect failed: Failed to recover container d409a100-2afb-497c-864f-fe3002cf65d9 with pid 50405: No ephemeral ports found
To remedy this do as follows:
Step 1: rm -f /var/lib/mesos/meta/slaves/latest
        This ensures slave doesn't recover old live executors.
Step 2: Restart the slave.
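
To make the desired behavior concrete, here is a minimal Python sketch of a recovery pass that tolerates partially installed filters instead of aborting. It is not the actual Mesos patch: the function `plan_recovery`, the `RecoveryPlan` type, and the `(link, container_id, port_range)` filter tuples are all hypothetical names chosen for illustration, and the sketch assumes that a fully installed container has one filter on each of host eth0 and host lo with the same ephemeral port range.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Host links named in the description; everything else here is hypothetical.
HOST_LINKS = ("eth0", "lo")


@dataclass
class RecoveryPlan:
    recovered: dict = field(default_factory=dict)          # container id -> port range
    to_cleanup: list = field(default_factory=list)         # containers to destroy
    filters_to_remove: list = field(default_factory=list)  # (link, port range) pairs


def plan_recovery(container_ids, host_filters):
    """Decide what to recover and what to clean up.

    container_ids: ids of containers the checkpointed state says are alive.
    host_filters:  (link, container_id, port_range) tuples found on the host.
    """
    # Group the filters found on the host by the container they belong to.
    by_container = defaultdict(dict)
    for link, container_id, ports in host_filters:
        by_container[container_id][link] = ports

    plan = RecoveryPlan()
    known = set(container_ids)

    for container_id in container_ids:
        found = by_container.get(container_id, {})
        ranges = set(found.values())
        # Fully installed: a filter on every host link, all with one port range.
        if set(found) == set(HOST_LINKS) and len(ranges) == 1:
            plan.recovered[container_id] = ranges.pop()
        else:
            # Partially installed (or missing) filters: do not fail recovery.
            # Schedule the container for cleanup and remove what was installed.
            plan.to_cleanup.append(container_id)
            plan.filters_to_remove.extend(sorted(found.items()))

    # Filters that belong to no known container would otherwise leak.
    for container_id, found in by_container.items():
        if container_id not in known:
            plan.filters_to_remove.extend(sorted(found.items()))

    return plan
```

In this sketch a container whose filters are incomplete ends up in `to_cleanup` rather than raising an error such as "No ephemeral ports found", and any leftover filters on host eth0 or host lo are collected in `filters_to_remove` so they are not leaked.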