  1. Hadoop YARN
  2. YARN-914 (Umbrella) Support graceful decommission of nodemanager
  3. YARN-9721

An easy method to exclude a nodemanager from the yarn cluster cleanly


    Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      To take a nodemanager server offline, we add it to the file configured by nodes.exclude-path
      and run the "rmadmin -refreshNodes" command to decommission the server.
      But this method does not remove the node cleanly: the nodemanager server still appears under Decommissioned Nodes, as the attachment shows.

       

      YARN-4311 enables a removalTimer to clean up untracked nodes.
      But the logic of the isUntrackedNode method is too restrictive. If include-path is not used, no server can meet its criteria, and relying on an include file would introduce a maintenance risk.

      If a yarn cluster is installed in the cloud, nodemanager servers are created and deleted frequently. We need a way to exclude a nodemanager from the yarn cluster cleanly. Otherwise, the map returned by rmContext.getInactiveRMNodes() keeps growing, which can cause memory problems in the RM.
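
      The behaviour described above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not the Hadoop source: the class UntrackedNodeSketch and its methods isUntracked and sweep are hypothetical names, the rule inside isUntracked is inferred from this description (a node counts as untracked only when an include list is in use and the host is in neither the include nor the exclude list), and any removal timeout is ignored.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Simplified model of the untracked-node check; NOT the actual Hadoop source. */
public class UntrackedNodeSketch {

    // Assumed rule (inferred from the issue text): a host is untracked only
    // when an include list is in use and the host is in neither list.
    static boolean isUntracked(String host, Set<String> includes, Set<String> excludes) {
        return !includes.isEmpty()
            && !includes.contains(host)
            && !excludes.contains(host);
    }

    // Models one pass of a removal timer over the inactive-node map:
    // drops every untracked entry and returns how many entries remain.
    static int sweep(Map<String, String> inactive, Set<String> includes, Set<String> excludes) {
        inactive.keySet().removeIf(host -> isUntracked(host, includes, excludes));
        return inactive.size();
    }

    public static void main(String[] args) {
        // Two hypothetical cloud hosts that were decommissioned and then deleted.
        Map<String, String> inactive = new HashMap<>();
        inactive.put("nm-1", "DECOMMISSIONED");
        inactive.put("nm-2", "DECOMMISSIONED");

        // No include file in use: nothing qualifies as untracked, the map keeps its entries.
        System.out.println(sweep(inactive, Set.of(), Set.of()));       // prints 2

        // Include file lists only the live host nm-3: both stale entries are removed.
        System.out.println(sweep(inactive, Set.of("nm-3"), Set.of())); // prints 0
    }
}
```

      In this model the first sweep removes nothing because the include list is empty, which mirrors the complaint that the map can only shrink when an include file is maintained.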

        Attachments

        1. decommission nodes.png
          14 kB
          Zac Zhou

            People

            • Assignee:
              Unassigned
              Reporter:
              yuan_zac Zac Zhou
            • Votes:
              0
              Watchers:
              3

              Dates

              • Created:
                Updated: