Details
Type: Bug
Status: Resolved
Priority: Blocker
Resolution: Fixed
Versions: 0.24.0, 0.25.0, 0.26.0, 0.27.0
Sprint: Mesosphere Sprint 30
2
Description
A persistent volume can be orphaned when:
- A framework registers with checkpointing enabled.
- The framework launches a task with a persistent volume.
- The agent exits. The task continues to run.
- Something wipes the agent's meta directory. This removes the checkpointed framework info from the agent.
- The agent comes back and recovers. It cannot find the framework for the task, so it now treats the task as orphaned.
The agent currently does not unmount the persistent volume of the orphaned task. With GLOG_v=1, it logs:
I0229 23:55:42.078940 5635 linux.cpp:711] Ignoring cleanup request for unknown container: a35189d3-85d5-4d02-b568-67f675b6dc97
Test implemented here: https://reviews.apache.org/r/44122/
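
The sequence in the description can be illustrated with a small toy model. This is only a sketch and not Mesos code: the Agent class, its fields, and the framework, container, and volume names are all hypothetical. It also assumes that recovery destroys orphaned containers and that cleanup is skipped for any container that was not rebuilt from the checkpointed state.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Toy stand-in for an agent; none of these names are Mesos APIs."""
    # Checkpointed state in the meta directory: framework id -> container ids.
    meta: dict = field(default_factory=dict)
    # Containers still running on the host: container id -> framework id.
    running: dict = field(default_factory=dict)
    # Persistent volume mounts present on the host: container id -> path.
    mounts: dict = field(default_factory=dict)
    # Containers the cleanup code knows about (rebuilt from meta on recovery).
    known: set = field(default_factory=set)
    log: list = field(default_factory=list)

    def launch(self, framework, container, volume):
        self.meta.setdefault(framework, set()).add(container)
        self.running[container] = framework
        self.mounts[container] = volume
        self.known.add(container)

    def restart(self, wipe_meta=False):
        # The agent process exits; tasks and mounts survive on the host.
        if wipe_meta:
            self.meta.clear()
        # Recovery rebuilds the known containers from checkpointed state only.
        self.known = {c for cs in self.meta.values() for c in cs}
        # Assumption: running containers without a framework are orphans
        # and get destroyed, which triggers a cleanup request.
        for container in [c for c in self.running if c not in self.known]:
            del self.running[container]
            self.cleanup(container)

    def cleanup(self, container):
        if container not in self.known:
            # Mirrors the log line quoted above: the volume is never unmounted.
            self.log.append(
                f"Ignoring cleanup request for unknown container: {container}")
            return
        self.mounts.pop(container, None)
        self.known.discard(container)


agent = Agent()
agent.launch("framework-1", "a35189d3", "/volumes/data")  # hypothetical values
agent.restart(wipe_meta=True)
print(agent.mounts)  # the mount entry is still present
print(agent.log)
```

In this model the mount entry outlives the container because cleanup returns early for a container it never recovered. Restarting without wiping the meta directory keeps the container known, so a later cleanup removes the mount.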