Details
Type: Task
Status: Resolved
Priority: Major
Resolution: Done
Description
Background:
Originally, we wanted the `volume/csi` isolator to leverage the existing service manager to launch CSI plugins as standalone containers. Currently, the service manager needs to call the following agent HTTP APIs:
- `GET_CONTAINERS` to get all standalone containers in its `recover` method.
- `KILL_CONTAINER` and `WAIT_CONTAINER` to kill the outdated standalone containers in its `recover` method.
- `LAUNCH_CONTAINER` via the existing ContainerDaemon to launch a CSI plugin as a standalone container when its `getEndpoint` method is called.
The problem with this design is that the `volume/csi` isolator may need to clean up orphan containers during agent recovery (this cleanup is triggered by the containerizer; see here for details). To clean up an orphan container that is using a CSI volume, the `volume/csi` isolator needs to instantiate and recover the service manager and get the CSI plugin’s endpoint from it, i.e., the service manager’s `getEndpoint` method will be called by the `volume/csi` isolator during agent recovery. As mentioned above, `getEndpoint` may need to call `LAUNCH_CONTAINER` to launch the CSI plugin as a standalone container, but because the agent is still in the recovering state, the agent will simply reject that HTTP call. So we have to instantiate and recover the service manager after agent recovery is done, but the `volume/csi` isolator has no such information (i.e., no signal that agent recovery is done).
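The failure mode described above can be illustrated with a small, self-contained sketch. This is not Mesos code: `FakeAgent`, `FakeServiceManager`, and the socket path are hypothetical stand-ins, and the only behavior modeled is that a `LAUNCH_CONTAINER`-style call is rejected while the agent is still recovering.

```cpp
#include <optional>
#include <string>

// Hypothetical agent states; the real agent has more.
enum class AgentState { RECOVERING, RUNNING };

// Hypothetical stand-in for the agent's HTTP API.
struct FakeAgent {
  AgentState state = AgentState::RECOVERING;
  std::optional<std::string> launched;

  // Models `LAUNCH_CONTAINER`: rejected while the agent is recovering.
  bool launchContainer(const std::string& containerId) {
    if (state == AgentState::RECOVERING) {
      return false;
    }
    launched = containerId;
    return true;
  }
};

// Hypothetical, heavily simplified service manager.
class FakeServiceManager {
public:
  explicit FakeServiceManager(FakeAgent* agent) : agent_(agent) {}

  // Returns the plugin endpoint, launching the plugin container first.
  // Returns no value if the agent rejects the launch.
  std::optional<std::string> getEndpoint(const std::string& plugin) {
    if (!agent_->launchContainer(plugin)) {
      return std::nullopt;
    }
    // Made-up endpoint path, for illustration only.
    return "unix:///tmp/" + plugin + ".sock";
  }

private:
  FakeAgent* agent_;
};
```

In this sketch, a `getEndpoint` call made by the isolator during recovery returns no endpoint, while the same call succeeds once the agent state is `RUNNING`.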
Solution:
We need to implement a new component (like `CSIVolumeManager` or a better name?) in the Mesos agent that is responsible for launching CSI plugins as standalone containers (via the existing service manager) and making CSI gRPC calls (via the existing volume manager).
- We can instantiate this new component in the `main` method of the agent and pass it to both the containerizer and the agent (i.e., it will be a member of the `Slave` object); the containerizer will in turn pass it to the `volume/csi` isolator.
- Since this new component relies on the service manager, which calls agent HTTP APIs, we need to pass the agent URL to it, like `process::http::URL(scheme, agentIP, agentPort, agentLibprocessId + "/api/v1")`; see here for an example.
- When the agent registers/reregisters with the master (`Slave::registered` and `Slave::reregistered`), we should call this new component’s `start` method (see here and here for examples), which will scan the directory given by `--csi_plugin_config_dir` and create a `service manager - volume manager` pair for each CSI plugin loaded from that directory.
- The `volume/csi` isolator needs to call this new component’s `publishVolume` and `unpublishVolume` methods in its `prepare` and `cleanup` methods, respectively.
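The wiring in the list above could look roughly like the following sketch. All types here are simplified, hypothetical stand-ins rather than the real Mesos classes: `agentApiUrl` only builds a plain string in the assumed shape of the `process::http::URL(...)` call, and `start` takes a list of plugin names instead of scanning `--csi_plugin_config_dir`.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Assumed string form of the agent API URL (illustration only).
std::string agentApiUrl(const std::string& scheme, const std::string& ip,
                        int port, const std::string& libprocessId) {
  return scheme + "://" + ip + ":" + std::to_string(port) + "/" +
         libprocessId + "/api/v1";
}

// Hypothetical "service manager - volume manager" pair for one plugin.
struct PluginPair {
  std::string serviceManagerAgentUrl;  // URL the service manager would call
  std::string volumeManagerPlugin;     // plugin the volume manager talks to
};

class CSIVolumeManager {
public:
  explicit CSIVolumeManager(std::string agentUrl)
    : agentUrl_(std::move(agentUrl)) {}

  // Called on registration and reregistration. Existing pairs are kept,
  // so calling it twice with the same plugins is harmless.
  void start(const std::vector<std::string>& pluginNames) {
    for (const std::string& name : pluginNames) {
      plugins_.emplace(name, PluginPair{agentUrl_, name});
    }
    started_ = true;
  }

  bool started() const { return started_; }
  std::size_t pluginCount() const { return plugins_.size(); }

private:
  std::string agentUrl_;
  std::map<std::string, PluginPair> plugins_;
  bool started_ = false;
};

// Both owners share the same instance, as created in `main`.
struct Containerizer {
  std::shared_ptr<CSIVolumeManager> csi;  // forwarded to the isolator
};

struct Agent {
  std::shared_ptr<CSIVolumeManager> csi;

  void registered(const std::vector<std::string>& plugins) {
    csi->start(plugins);
  }
  void reregistered(const std::vector<std::string>& plugins) {
    csi->start(plugins);
  }
};
```

Sharing one instance through `std::shared_ptr` is only one possible ownership choice; the point of the sketch is that the isolator (via the containerizer) and the agent observe the same component state.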
When cleaning up orphan containers during agent recovery, the `volume/csi` isolator will just call this new component’s `unpublishVolume` method as usual, and it is the new component’s responsibility to make the actual CSI gRPC call only after agent recovery is done and the agent has registered with the master (e.g., when this new component’s `start` method is called).
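One way to picture this deferral is a component that queues `unpublishVolume` requests until `start` has been called. The sketch below is a synchronous simplification under stated assumptions: the class name `DeferringVolumeManager` is hypothetical, and the injected `unpublish` callback stands in for the real CSI gRPC call, which in Mesos would be asynchronous.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

class DeferringVolumeManager {
public:
  // `unpublish` is a placeholder for the actual CSI gRPC call.
  explicit DeferringVolumeManager(
      std::function<void(const std::string&)> unpublish)
    : unpublish_(std::move(unpublish)) {}

  // Safe to call during agent recovery: before `start`, the request is
  // only recorded; after `start`, the call is made right away.
  void unpublishVolume(const std::string& volumeId) {
    if (!started_) {
      pending_.push_back(volumeId);
      return;
    }
    unpublish_(volumeId);
  }

  // Called once the agent has registered with the master. Flushes every
  // request queued during recovery; repeated calls do nothing.
  void start() {
    if (started_) {
      return;
    }
    started_ = true;

    std::vector<std::string> pending;
    pending.swap(pending_);
    for (const std::string& volumeId : pending) {
      unpublish_(volumeId);
    }
  }

private:
  std::function<void(const std::string&)> unpublish_;
  std::vector<std::string> pending_;
  bool started_ = false;
};
```

With this shape the isolator never needs to know whether recovery has finished: it calls `unpublishVolume` in both cases, and the ordering of the queued calls is preserved when `start` flushes them.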