
Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • None
    • 0.13
    • None

    Description

      Currently, DriverRestartHandler only reports the StartTime, which is not very helpful. Given how the rest of the restart mechanism is set up, users can also be confused in the following ways:

      1. When EvaluatorFailedHandler is called, the user cannot tell whether the failed Evaluator belongs to a previous Driver instance or to the current one unless they keep their own list of AllocatedEvaluators.
      2. When DriverRestartCompletedHandler is called, the user cannot easily tell whether it was triggered by a timeout or because all Evaluators have already either failed or reported back.

      The proposal is to pass two sets of Evaluator IDs to the DriverRestartHandler along with StartTime: the Evaluators that have failed on restart and the Evaluators that are expected to report back. Although this still does not make 1 and 2 fully explicit, it lets users know what to expect.
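
      A minimal sketch of how a user might consume the two proposed ID sets. It is illustrative only: the names RestartInfo, RestartTracker, and all of their methods are hypothetical and are not taken from the REEF API. It assumes Evaluator IDs are plain strings and that the user calls markAccountedFor from their own Evaluator handlers.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical payload a restart handler could receive (not the real REEF type). */
final class RestartInfo {
  private final long startTime;
  private final Set<String> expectedEvaluatorIds;
  private final Set<String> failedEvaluatorIds;

  RestartInfo(final long startTime, final Set<String> expected, final Set<String> failed) {
    this.startTime = startTime;
    // Defensive copies so later changes by the caller cannot alter the snapshot.
    this.expectedEvaluatorIds = Collections.unmodifiableSet(new HashSet<>(expected));
    this.failedEvaluatorIds = Collections.unmodifiableSet(new HashSet<>(failed));
  }

  long getStartTime() { return startTime; }
  Set<String> getExpectedEvaluatorIds() { return expectedEvaluatorIds; }
  Set<String> getFailedEvaluatorIds() { return failedEvaluatorIds; }
}

/** Hypothetical user-side bookkeeping built from the restart payload. */
final class RestartTracker {
  // Every Evaluator ID known to come from a previous Driver instance.
  private final Set<String> previousDriverIds = new HashSet<>();
  // Expected Evaluators that have neither reported back nor failed yet.
  private final Set<String> pending;

  RestartTracker(final RestartInfo info) {
    previousDriverIds.addAll(info.getExpectedEvaluatorIds());
    previousDriverIds.addAll(info.getFailedEvaluatorIds());
    pending = new HashSet<>(info.getExpectedEvaluatorIds());
  }

  /** Point 1: does a failed Evaluator belong to a previous Driver instance? */
  boolean isFromPreviousDriver(final String evaluatorId) {
    return previousDriverIds.contains(evaluatorId);
  }

  /** Call when an expected Evaluator reports back or fails. */
  void markAccountedFor(final String evaluatorId) {
    pending.remove(evaluatorId);
  }

  /** Point 2: if Evaluators are still pending at completion, a timeout is the likely cause. */
  boolean completedByTimeout() {
    return !pending.isEmpty();
  }
}
```

      With this bookkeeping, a failure handler could call isFromPreviousDriver to address point 1, and a restart-completed handler could call completedByTimeout to address point 2, without the user maintaining a separate list of AllocatedEvaluators.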

      Follow-up items will make 1 and 2 more explicit.
      For 1, an item will be filed to create a special DriverRestartEvaluatorFailedHandler. See REEF-688. For 2, an item will be filed to extend the DriverRestartCompletedHandler to pass back an object that includes the set of failed Evaluator IDs. See REEF-691.

            People

              Assignee: afchung90 Andrew Chung
              Reporter: afchung90 Andrew Chung