Details
Type: Bug
Status: Resolved
Priority: Critical
Resolution: Fixed
Description
This was discovered when simulating frequent slave restarts on a staging cluster.
It appears that when we reconcile tasks from a re-registering slave, we correctly remove each lost task and update the slave's resourcesInUse. However, a lost task may have an executor associated with it, and reconciliation currently does nothing to reconcile that executor's resources.
We'll need to ensure the reconciliation properly removes executors corresponding to these lost tasks.
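
To make the intended behavior concrete, here is a minimal sketch in Python of reconciliation that also releases executors for lost tasks. It is not the actual Mesos code: the `Slave`, `Task`, and `Executor` classes, the `cpus_in_use` field (standing in for resourcesInUse), and the `reconcile` function are all hypothetical names. It also assumes the re-registering slave reports the set of executors it still runs, and that an executor of a lost task should be removed only if it is absent from that report.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    task_id: str
    executor_id: str
    cpus: float


@dataclass
class Executor:
    executor_id: str
    cpus: float


@dataclass
class Slave:
    """Hypothetical bookkeeping for one slave; not the real Mesos structure."""
    tasks: dict = field(default_factory=dict)      # task_id -> Task
    executors: dict = field(default_factory=dict)  # executor_id -> Executor
    cpus_in_use: float = 0.0                       # stands in for resourcesInUse

    def add_task(self, task):
        self.tasks[task.task_id] = task
        self.cpus_in_use += task.cpus

    def add_executor(self, executor):
        self.executors[executor.executor_id] = executor
        self.cpus_in_use += executor.cpus

    def remove_task(self, task_id):
        self.cpus_in_use -= self.tasks.pop(task_id).cpus

    def remove_executor(self, executor_id):
        self.cpus_in_use -= self.executors.pop(executor_id).cpus


def reconcile(slave, reported_task_ids, reported_executor_ids):
    """Remove tasks the slave no longer reports, plus their unreported executors."""
    lost_executor_ids = set()
    for task_id in list(slave.tasks):
        if task_id not in reported_task_ids:
            lost_executor_ids.add(slave.tasks[task_id].executor_id)
            slave.remove_task(task_id)

    # The step described as missing in this ticket: release the resources
    # of executors that belonged to lost tasks and were not reported.
    for executor_id in lost_executor_ids:
        if executor_id in slave.executors and executor_id not in reported_executor_ids:
            slave.remove_executor(executor_id)


# Usage: the slave re-registers reporting neither the task nor its executor.
slave = Slave()
slave.add_executor(Executor("e1", cpus=0.5))
slave.add_task(Task("t1", executor_id="e1", cpus=1.0))
reconcile(slave, reported_task_ids=set(), reported_executor_ids=set())
```

Without the second loop, the executor's 0.5 CPUs in this example would stay counted against the slave after the task was removed, which is the leak described above.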