Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Later
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Runtime / Coordination
    • Labels: None

    Description

      1. In standalone mode, the LocalDispatcher watches the JobMaster.
      The LocalDispatcher detects the failure of the JobMaster, recovers the JobGraph and libraries from persistent storage, and spawns a new JobManager.
      The new JobMaster competes for leadership and saves its address to ZooKeeper storage.
      The new JobMaster registers at the ResourceManager.
      The new JobMaster recovers the execution of its job (the execution graph) from the latest completed checkpoint.
      2. In YARN mode, the YarnApplicationMasterRunner creates a ProcessReaper for the JobMaster.
      The ProcessReaper monitors the JobMaster and kills the JVM when the JobMaster terminates.
      YARN then creates a new AppMaster that contains a new JobManager; the JobGraph and libraries are retrieved as startup artifacts.
      The new JobMaster competes for leadership and saves its address to ZooKeeper storage.
      The new JobMaster registers at the ResourceManager.
      The new JobMaster recovers the execution of its job (the execution graph) from the latest completed checkpoint.
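
The two recovery paths above differ only in where the JobGraph and libraries come from; the leadership, registration, and checkpoint-restore steps are shared. A minimal Python sketch of that flow follows. It is an illustrative model, not Flink code: `PersistentStore`, `LeaderRegistry`, `ResourceManager`, `recover_standalone`, and `recover_yarn` are hypothetical names, the ZooKeeper election is reduced to "first contender wins", and addresses and job IDs are placeholder strings.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PersistentStore:
    """Hypothetical persistent storage for job artifacts and checkpoints."""
    job_graphs: Dict[str, str] = field(default_factory=dict)
    libraries: Dict[str, List[str]] = field(default_factory=dict)
    completed_checkpoints: Dict[str, List[int]] = field(default_factory=dict)

    def latest_completed_checkpoint(self, job_id: str) -> Optional[int]:
        ids = self.completed_checkpoints.get(job_id, [])
        return max(ids) if ids else None


@dataclass
class LeaderRegistry:
    """Hypothetical stand-in for ZooKeeper: one leader address per job."""
    leader_address: Dict[str, str] = field(default_factory=dict)

    def release(self, job_id: str) -> None:
        # Models the failed leader's entry disappearing.
        self.leader_address.pop(job_id, None)

    def compete(self, job_id: str, address: str) -> bool:
        # First contender wins; an existing leader stays in place.
        return self.leader_address.setdefault(job_id, address) == address


@dataclass
class ResourceManager:
    """Hypothetical registry of the JobMaster address per job."""
    registered: Dict[str, str] = field(default_factory=dict)

    def register(self, job_id: str, address: str) -> None:
        self.registered[job_id] = address


@dataclass
class JobMaster:
    job_id: str
    address: str
    job_graph: str
    libraries: List[str]
    restored_checkpoint: Optional[int] = None

    def start(self, store: PersistentStore, registry: LeaderRegistry,
              rm: ResourceManager) -> bool:
        # Shared steps: compete for leadership, register, restore.
        if not registry.compete(self.job_id, self.address):
            return False
        rm.register(self.job_id, self.address)
        self.restored_checkpoint = store.latest_completed_checkpoint(self.job_id)
        return True


def recover_standalone(job_id, address, store, registry, rm) -> JobMaster:
    # Standalone: the dispatcher reloads JobGraph and libraries from storage.
    registry.release(job_id)
    jm = JobMaster(job_id, address, store.job_graphs[job_id], store.libraries[job_id])
    jm.start(store, registry, rm)
    return jm


def recover_yarn(job_id, address, startup_artifacts, store, registry, rm) -> JobMaster:
    # YARN: the new AppMaster receives JobGraph and libraries as startup artifacts.
    registry.release(job_id)
    jm = JobMaster(job_id, address,
                   startup_artifacts["job_graph"], startup_artifacts["libraries"])
    jm.start(store, registry, rm)
    return jm
```

In this model a second contender calling `compete` while a leader is registered simply gets `False` back, which is the simplification that stands in for losing the election.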

          People

            Assignee: Unassigned
            Reporter: Jing Zhang (jingzhang)
            Votes: 0
            Watchers: 3
