Details
Sub-task
Status: Resolved
Major
Resolution: Incomplete
2.2.0
None
Description
When the driver failover_timeout was always set to zero, we relied on the Mesos master to detect the disconnected driver and tear down the framework. When failover_timeout is nonzero, we have to make sure that the driver framework is torn down in all cases. Some cases that require an explicit teardown are:
- When a driver job submission is killed by the user
- In --supervise mode, when a driver fails
Note: the driver and executors do stop running. The only issue is that, without the teardown, the framework shows up as "Inactive" rather than "Completed" for a period of failover_timeout seconds.
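For illustration only, here is a minimal Python sketch of the condition described above and of what an explicit teardown request could look like. It assumes the teardown is issued against the Mesos master's `/master/teardown` operator endpoint, which may not be how the actual fix is implemented. The helper names `needs_explicit_teardown` and `build_teardown_request`, the master URL, and the framework ID are all hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def needs_explicit_teardown(failover_timeout_secs: float) -> bool:
    """Return True when the framework has to be torn down explicitly.

    With a zero timeout we rely on the master to tear down the framework
    once it detects the disconnected driver. With a nonzero timeout the
    framework lingers as "Inactive" until the timeout expires.
    """
    return failover_timeout_secs > 0


def build_teardown_request(master_url: str, framework_id: str) -> Request:
    """Build (but do not send) a POST to the master's teardown endpoint.

    Assumption: the master is reachable at master_url and accepts a
    form-encoded frameworkId on /master/teardown.
    """
    body = urlencode({"frameworkId": framework_id}).encode("utf-8")
    url = master_url.rstrip("/") + "/master/teardown"
    return Request(url, data=body, method="POST")


if __name__ == "__main__":
    # Hypothetical values; nothing is sent over the network here.
    if needs_explicit_teardown(120.0):
        req = build_teardown_request("http://mesos-master.example:5050", "fw-123")
        print(req.get_method(), req.full_url, req.data)
```

The request is only constructed, not sent, so the sketch can be read or run without a Mesos cluster. Sending it would require passing the request to `urllib.request.urlopen` along with whatever authentication the master requires.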