  1. Hadoop YARN
  2. YARN-914

(Umbrella) Support graceful decommission of nodemanager

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.0.4-alpha
    • Fix Version/s: None
    • Component/s: graceful
    • Labels: None

      Description

      When NMs are decommissioned for non-fault reasons (capacity change etc.), it's desirable to minimize the impact on running applications.

      Currently, if an NM is decommissioned, all containers running on that NM need to be rescheduled on other NMs. Furthermore, for finished map tasks whose map output has not yet been fetched by the job's reducers, the map tasks need to be rerun as well.

      We propose introducing an optional mechanism to gracefully decommission a node manager.
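      The idea can be illustrated with a small state machine. This is only a sketch under stated assumptions, not YARN's implementation: the DECOMMISSIONING state name and the notion of a timeout come from the sub-task titles on this issue, while the `Node` class, its fields, and its methods are hypothetical names chosen for illustration. A draining node stops accepting new containers, waits for running containers to finish and for pending map output to be fetched, and is only cut off forcibly once the timeout expires.

```python
from dataclasses import dataclass, field
from enum import Enum


class NodeState(Enum):
    RUNNING = "RUNNING"
    DECOMMISSIONING = "DECOMMISSIONING"  # draining: no new work, existing work continues
    DECOMMISSIONED = "DECOMMISSIONED"


@dataclass
class Node:
    """Hypothetical stand-in for a node manager as seen by the resource manager."""
    name: str
    running_containers: set = field(default_factory=set)
    # Applications whose finished map output on this node has not been fetched yet.
    apps_with_unfetched_output: set = field(default_factory=set)
    state: NodeState = NodeState.RUNNING
    deadline: float = 0.0  # absolute time (seconds) after which draining is cut short

    def start_graceful_decommission(self, now: float, timeout_s: float) -> None:
        """Begin draining; only a RUNNING node can enter DECOMMISSIONING."""
        if self.state is NodeState.RUNNING:
            self.state = NodeState.DECOMMISSIONING
            self.deadline = now + timeout_s

    def can_accept_containers(self) -> bool:
        """A draining or decommissioned node must not receive new containers."""
        return self.state is NodeState.RUNNING

    def is_drained(self) -> bool:
        return not self.running_containers and not self.apps_with_unfetched_output

    def poll(self, now: float) -> list:
        """Advance the state machine; return the containers that had to be killed."""
        if self.state is not NodeState.DECOMMISSIONING:
            return []
        if self.is_drained() or now >= self.deadline:
            # Either nothing is left to wait for, or the timeout expired and
            # whatever is still running is lost, as in a forceful decommission.
            killed = sorted(self.running_containers)
            self.running_containers.clear()
            self.apps_with_unfetched_output.clear()
            self.state = NodeState.DECOMMISSIONED
            return killed
        return []
```

      In this sketch, a node that drains before its deadline reaches DECOMMISSIONED with no containers killed, so nothing has to be rescheduled or rerun; a node that hits the deadline behaves like today's decommission for whatever work remains.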

          Issue Links

          1. RM to inform AMs when a container completed due to NM going offline -planned or unplanned [Sub-task, Resolved, Rohith Sharma K S]
          2. RMNode State Transition Update with DECOMMISSIONING state [Sub-task, Resolved, Junping Du]
          3. Resource update during NM graceful decommission [Sub-task, Resolved, Brook Zhou]
          4. Notify AM with containers (on decommissioning node) could be preempted after timeout. [Sub-task, Open, Sunil Govindan]
          5. New parameter or CLI for decommissioning node gracefully in RMAdmin CLI [Sub-task, Resolved, Devaraj K]
          6. Automatic and Asynchronous Decommissioning Nodes Status Tracking [Sub-task, Resolved, Daniel Zhi]
          7. UI changes for decommissioning node [Sub-task, Resolved, Sunil Govindan]
          8. RMNodeResourceUpdateEvent update from scheduler can lead to race condition [Sub-task, Resolved, Wilfred Spiegelenburg]
          9. Document graceful decommission CLI and usage [Sub-task, Resolved, Elek, Marton]
          10. Add -client|server argument for graceful decom [Sub-task, Resolved, Robert Kanter]
          11. Server-Side NM Graceful Decommissioning with RM HA [Sub-task, Patch Available, Gergely Pollak]
          12. Server-Side NM Graceful Decommissioning subsequent call behavior [Sub-task, Open, Unassigned]
          13. Multiple format support (JSON, etc.) for exclude node file in NM graceful decommission with timeout [Sub-task, Open, Unassigned]
          14. Client-side NM graceful decom is not triggered when jobs finish [Sub-task, Resolved, Robert Kanter]
          15. Clarify DecommissionType.FORCEFUL comment [Sub-task, Resolved, Vrushali C]
          16. Document the current known issue with server-side NM graceful decom [Sub-task, Resolved, Robert Kanter]
          17. Remove XML excludes file format [Sub-task, Resolved, Robert Kanter]
          18. Better utilize gracefully decommissioning node managers [Sub-task, Open, Karthik Palaniappan]
          19. DecommissioningNodesWatcher should get lists of running applications on node from RMNode. [Sub-task, Patch Available, Abhishek Modi]

              People

              • Assignee: Junping Du (djp)
              • Reporter: Luke Lu (vicaya)
              • Votes: 3
              • Watchers: 76
