Details
- Type: Sub-task
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
Currently, DriverRestartHandler only reports the StartTime, which is not very helpful. Given how the rest of the restart mechanism is set up, users can also be confused in the following ways:
1. When EvaluatorFailedHandler is called, the user cannot tell for sure whether the failed Evaluator belongs to a previous Driver instance or to the current Driver instance, unless they keep their own list of AllocatedEvaluators.
2. When DriverRestartCompletedHandler is called, the user cannot easily find out whether it was called because the timeout expired or because all Evaluators have already either failed or reported back.
The proposal is to have DriverRestartHandler return, along with StartTime, two sets of Evaluator IDs: those that have failed on restart and those that are expected to report back. Although this is still not fully explicit for either 1 or 2, it lets users know what to expect.
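A minimal Java sketch of what such a restart event could carry. The class name `DriverRestartInfo` and all of its methods are hypothetical and are not taken from the REEF API; the sketch only illustrates the shape of the proposal (StartTime plus the two ID sets) and how a user could use it to address point 1.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical event type for illustration only; not the actual REEF API.
final class DriverRestartInfo {
  private final long startTime;
  private final Set<String> expectedEvaluatorIds;
  private final Set<String> failedEvaluatorIds;

  DriverRestartInfo(final long startTime,
                    final Set<String> expectedEvaluatorIds,
                    final Set<String> failedEvaluatorIds) {
    this.startTime = startTime;
    // Defensive copies so later changes by the caller do not leak in.
    this.expectedEvaluatorIds =
        Collections.unmodifiableSet(new HashSet<>(expectedEvaluatorIds));
    this.failedEvaluatorIds =
        Collections.unmodifiableSet(new HashSet<>(failedEvaluatorIds));
  }

  long getStartTime() {
    return startTime;
  }

  // Evaluators from the previous Driver that are expected to report back.
  Set<String> getExpectedEvaluatorIds() {
    return expectedEvaluatorIds;
  }

  // Evaluators from the previous Driver that already failed on restart.
  Set<String> getFailedEvaluatorIds() {
    return failedEvaluatorIds;
  }

  // True if the ID is known from the previous Driver instance (point 1).
  boolean isFromPreviousDriver(final String evaluatorId) {
    return expectedEvaluatorIds.contains(evaluatorId)
        || failedEvaluatorIds.contains(evaluatorId);
  }
}
```

With an object like this, a failed-Evaluator handler could call `isFromPreviousDriver` on the failed Evaluator's ID instead of maintaining its own list of AllocatedEvaluators.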
Separate items will be filed to make 1 and 2 more explicit.
For 1, an item will be filed to create a special DriverRestartEvaluatorFailedHandler. See REEF-688. For 2, an item will be filed to extend the DriverRestartCompletedHandler to pass back an object that includes the set of failed Evaluator IDs. See REEF-691.
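Until those items land, a user could approximate both distinctions on their own by tracking the two sets of Evaluator IDs described in the proposal. The following Java sketch assumes a hypothetical helper, `RestartProgressTracker`, that the user would drive from their own handlers; none of these names come from REEF.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical user-side helper for illustration only; not the REEF API.
final class RestartProgressTracker {
  // Previous-Driver Evaluators that have neither reported back nor failed.
  private final Set<String> pending;
  // Previous-Driver Evaluators known to have failed.
  private final Set<String> failed;

  RestartProgressTracker(final Set<String> expectedIds,
                         final Set<String> failedOnRestartIds) {
    this.pending = new HashSet<>(expectedIds);
    this.failed = new HashSet<>(failedOnRestartIds);
  }

  // Call when a previous-Driver Evaluator reports back.
  void onReported(final String evaluatorId) {
    pending.remove(evaluatorId);
  }

  // Call when any Evaluator fails. Returns true if it belonged to the
  // previous Driver instance, false if it is from the current one.
  boolean onFailed(final String evaluatorId) {
    if (pending.remove(evaluatorId)) {
      failed.add(evaluatorId);
      return true;
    }
    return false;
  }

  // If this is false when restart completes, the completion was
  // presumably caused by the timeout rather than by all Evaluators
  // having reported back or failed.
  boolean allAccountedFor() {
    return pending.isEmpty();
  }

  Set<String> getFailedEvaluatorIds() {
    return new HashSet<>(failed);
  }
}
```

Checking `allAccountedFor()` inside the restart-completed handler would then suggest whether completion was triggered by the timeout, and `getFailedEvaluatorIds()` would give the kind of set that REEF-691 proposes to pass back directly.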
Issue Links
- blocks: REEF-689 Restructure handler for DriverRestart in .NET (Resolved)