  1. REEF
  2. REEF-364 A REEF application for utilizing volatile resources
  3. REEF-501

Distinguish different types of FailedEvaluator in Vortex


    Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Vortex
    • Labels:
      None

      Description

      Currently, Vortex treats all Evaluator failures the same way, through a single FailedEvaluatorHandler. We should handle different types of failures differently.

      Type 1: Resource preemption
      We react based on a configured policy (e.g. re-request indefinitely). If needed, we can even add a new event handler to the REEF Driver, named PreemptedEvaluatorHandler, just for this type (as a separate JIRA issue outside of the Vortex umbrella JIRA).

      Type 2: Internal Vortex code failure
      This can happen nondeterministically and may even result in an infinite loop of resource release and re-request. In such a case, we should probably shut down the Driver immediately, both for ease of debugging and to prevent it from interfering with other jobs in the cluster.

      Type 3: Other types of failures
      If the failure is caused by an issue such as OOM (out of memory), we should also treat that case differently.
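
      The three types above could be separated roughly as in the following sketch. It is only an illustration under stated assumptions: FailureClassifier, FailureType, Reaction, VortexInternalException, and the preempted flag are all hypothetical names that exist in neither REEF nor Vortex, and how a preemption would actually be detected is left open. The reaction for Type 3 is a placeholder, since the description does not say what it should be.

```java
/** Hypothetical sketch; none of these names are part of REEF or Vortex. */
final class FailureClassifier {

  /** The three failure types named in the description. */
  enum FailureType { PREEMPTION, INTERNAL_VORTEX_FAILURE, OTHER }

  /** Reactions suggested in the description; HANDLE_SEPARATELY is a placeholder. */
  enum Reaction { RE_REQUEST_RESOURCE, SHUT_DOWN_DRIVER, HANDLE_SEPARATELY }

  /** Hypothetical marker exception for failures raised by Vortex's own code. */
  static final class VortexInternalException extends RuntimeException {
    VortexInternalException(final String message) {
      super(message);
    }
  }

  private FailureClassifier() {
  }

  /**
   * Classifies a failure.
   * Assumption: the caller already knows whether the resource was preempted.
   */
  static FailureType classify(final boolean preempted, final Throwable cause) {
    if (preempted) {
      return FailureType.PREEMPTION;
    }
    if (cause instanceof VortexInternalException) {
      return FailureType.INTERNAL_VORTEX_FAILURE;
    }
    // Everything else, e.g. an OutOfMemoryError, falls into Type 3.
    return FailureType.OTHER;
  }

  /** Maps each failure type to the reaction proposed for it. */
  static Reaction reactionFor(final FailureType type) {
    switch (type) {
      case PREEMPTION:
        return Reaction.RE_REQUEST_RESOURCE;
      case INTERNAL_VORTEX_FAILURE:
        return Reaction.SHUT_DOWN_DRIVER;
      case OTHER:
        return Reaction.HANDLE_SEPARATELY;
      default:
        throw new IllegalArgumentException("Unknown failure type: " + type);
    }
  }
}
```

      A failed-evaluator handler could then call reactionFor(classify(...)) and branch on the result, instead of treating every failure identically.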

              People

              • Assignee: John Yang (johnyangk)
              • Reporter: John Yang (johnyangk)