REEF (Retired) / REEF-1248

Identify the scenarios that need to restart evaluators

Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Component: REEF.NET

    Description

      App error
      a. Any transient app error should be handled by retry logic inside the code. If it still fails after the retries, restarting the server won't help.
      b. Any expected app exception should be treated as non-recoverable.
      c. Any unexpected app exception should be treated as non-recoverable.
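
The app-error rules above can be sketched as a bounded retry wrapper. This is only an illustration in Python, not REEF.NET code: `TransientError`, `run_with_retry`, and its parameters are hypothetical names, and the retry count is an assumed default. Transient errors are retried in code; once retries are exhausted, or when any other exception is raised, the failure is surfaced as non-recoverable rather than triggering a restart.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for a transient application error."""


class NonRecoverableError(Exception):
    """Hypothetical marker: a restart would not help with this failure."""


def run_with_retry(operation, max_attempts=3, delay_seconds=0.0):
    """Run `operation`, retrying only on TransientError.

    Any other exception (expected or unexpected app exception) propagates
    immediately and is treated by the caller as non-recoverable.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError as error:
            last_error = error
            if attempt < max_attempts:
                time.sleep(delay_seconds)
    # Retries are exhausted: per rule (a), restarting would not help.
    raise NonRecoverableError("transient error persisted after retries") from last_error
```

With this shape, the decision of whether to restart never has to consider app errors: they are either resolved by the retry loop or reported as non-recoverable.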

      Resource issue
      a. The evaluator is killed by the RM (resource manager). We should respond to this case.

      System error
      a. System issue causing a machine crash
      b. Other system errors encountered during the 10-month data testing. What exact events were received?

            People

              Assignee: Unassigned
              Reporter: Julia Wang (juliaw)