  1. REEF (Retired)
  2. REEF-345

Complete implementation for YARN AM HA

Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: YARN, HDInsight

    Description

      The implementation logic for Application Master high availability (AM HA) on YARN is incomplete.

      Sub-Tasks

        1. Refactor the Driver restart code to be modular (Sub-task, Resolved, Andrew Chung)
        2. Add configuration in allowing other ways to preserve running evaluators aside from DFS (Sub-task, Resolved, Andrew Chung)
        3. Add concept of application attempt to driver startup (Sub-task, Resolved, Andrew Chung)
        4. Expose configuration to preserve containers across application on YARN (Sub-task, Resolved, Andrew Chung)
        5. Make Driver call DriverRestartActiveContextHandler instead of have the Evaluator decide when to call it (Sub-task, Resolved, Andrew Chung)
        6. Move Driver restart configuration options to a separate configuration (Sub-task, Resolved, Andrew Chung)
        7. DriverRestartHandler is not bound correctly to its C# equivalent (Sub-task, Resolved, Andrew Chung)
        8. Notifying evaluator restart failure entails calling AllocatedEvaluatorHandler (Sub-task, Resolved, Andrew Chung)
        9. Determine restarts on YARN by using a runtime environment variable (Sub-task, Resolved, Andrew Chung)
        10. Tighten previous evaluator ID checks by using entire set of evaluator IDs (Sub-task, Resolved, Andrew Chung)
        11. Add a configurable timeout for driver to recover evaluators on restart (Sub-task, Resolved, Andrew Chung)
        12. Move restart functions from DriverStatusManager to DriverRestartManager (Sub-task, Resolved, Andrew Chung)
        13. Evaluators that are kept alive are not able to re-register with the driver (Sub-task, Resolved, Andrew Chung)
        14. Add example for restart on YARN (Sub-task, Resolved, Andrew Chung)
        15. Create a default implementation for DriverRestartManager (Sub-task, Resolved, Andrew Chung)
        16. Add a YARN .NET driver restart example (Sub-task, Resolved, Andrew Chung)
        17. Add NodeDescriptor, number of cores, and memory to construct complete EvaluatorManager for recovered evaluator (Sub-task, Closed, Andrew Chung)
        18. Application should not close evaluators on driver failure if restart is enabled. (Sub-task, Resolved, Andrew Chung)
        19. Generate class hierarchy on DriverRestart (Sub-task, Resolved, Andrew Chung)
        20. Keep state of previous evaluators with a state machine (Sub-task, Resolved, Andrew Chung)
        21. Enable creation of EvaluatorManager on restarted evaluators (Sub-task, Resolved, Andrew Chung)
        22. Change the way restart is represented in DriverRestartManager (Sub-task, Resolved, Andrew Chung)
        23. Implement driver restart completion logic (Sub-task, Resolved, Andrew Chung)
        24. Add container to the set of containers on restart such that YarnContainerManager will know how to release it (Sub-task, Resolved, Andrew Chung)
        25. Make the getRackName() function in YARN sharable (Sub-task, Resolved, Andrew Chung)
        26. Add an object that encapsulates information needed to recover an evaluator on restart (Sub-task, Resolved, Andrew Chung)
        27. Evaluator removed twice from evaluator log if an evaluator failed on restart (Sub-task, Resolved, Andrew Chung)
        28. Add default constructor to DefaultRackNameFormatter (Sub-task, Resolved, Markus Weimer)
        29. Change DriverRestartState enum to all caps (Sub-task, Resolved, Andrew Chung)
        30. Remove usage of get/setIsFromPreviousDriver() (Sub-task, Resolved, Andrew Chung)
        31. Add csproj for Org.Apache.REEF.Examples.DriverRestart (Sub-task, Resolved, Andrew Chung)
        32. Restructure handler for DriverRestart in Java (Sub-task, Resolved, Andrew Chung)
        33. Create a DriverRestartCompleted event in Java (Sub-task, Resolved, Andrew Chung)
        34. Add DriverRestartEvaluatorFailedHandler (Sub-task, Resolved, Andrew Chung)
        35. Invoke onDriverRestartContextActive in ContextRepresenters for Evaluators with Active Contexts (Sub-task, Resolved, Andrew Chung)
        36. Implement driver restart logic for Java evaluator (Sub-task, Open, Unassigned)
        37. Restructure handler for DriverRestart in .NET (Sub-task, Resolved, Andrew Chung)
        38. Create a DriverRestartCompleted event in .NET (Sub-task, Resolved, Andrew Chung)
        39. Add a test for Evaluator-Preserving Driver restarts (Sub-task, Open, Unassigned)
        40. Improve DriverRestart Example (Sub-task, Resolved, Andrew Chung)
        41. A Driver Connection State event handler in Evaluator for C# (Sub-task, Resolved, Andrew Chung)
        42. Split DriverInfo and ReefServicesInfo and move certain avsc files into reef-common (Sub-task, Closed, Unassigned)
        43. Support multiple RMs and other Hadoop distributions (Sub-task, Resolved, Andrew Chung)
        44. Use a two-file approach for DFSEvaluatorLogOverwriteWriter (Sub-task, Resolved, Andrew Chung)
        45. Improve efficiency of DFSEvaluatorLogOverwriteWriter (Sub-task, Open, Unassigned)
        46. Allow transition from ALLOCATED to RUNNING in EvaluatorState (Sub-task, Resolved, Mariia Mykhailova)
        47. Don't close streams in DFSEvaluatorLogOverwriteReaderWriter before flushing them (Sub-task, Resolved, Mariia Mykhailova)
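
        For readers unfamiliar with the idea behind the sub-task "Determine restarts on YARN by using a runtime environment variable", here is a minimal sketch of one way a driver process could tell that it is running in a second or later application attempt. This issue does not say which variable REEF reads; the sketch assumes the standard YARN `CONTAINER_ID` variable, whose value embeds the application attempt number. The class `RestartDetection` and its methods are hypothetical names for illustration and are not part of REEF.

```java
/** Hypothetical helper for illustration only; not part of REEF. */
public final class RestartDetection {

  private RestartDetection() {
  }

  /**
   * Extracts the application attempt number from a YARN container ID string,
   * e.g. "container_1410901177871_0001_02_000001" or, when an epoch is present,
   * "container_e17_1410901177871_0001_02_000001".
   */
  public static int attemptOf(final String containerId) {
    final String[] parts = containerId.split("_");
    if (parts.length < 5 || !"container".equals(parts[0])) {
      throw new IllegalArgumentException("Unexpected container ID: " + containerId);
    }
    // The attempt number is the second-to-last field, with or without an epoch.
    return Integer.parseInt(parts[parts.length - 2]);
  }

  /** Assumption: any attempt after the first one is treated as a driver restart. */
  public static boolean isRestart(final String containerId) {
    return attemptOf(containerId) > 1;
  }

  public static void main(final String[] args) {
    // CONTAINER_ID is set by the YARN NodeManager in the container's environment.
    final String containerId = System.getenv("CONTAINER_ID");
    if (containerId == null) {
      System.out.println("CONTAINER_ID is not set; not running inside a YARN container.");
      return;
    }
    System.out.println("Driver restart detected: " + isRestart(containerId));
  }
}
```

        Parsing the string directly keeps the sketch free of Hadoop dependencies; code that already links against the YARN client libraries would more likely parse the value with the YARN `ContainerId` class and read the attempt from it.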

          People

            Assignee: Andrew Chung (afchung90)
            Reporter: Andrew Chung (afchung90)
            Votes: 0
            Watchers: 1