  1. Hadoop YARN
  2. YARN-914 (Umbrella) Support graceful decommission of nodemanager
  3. YARN-9721

An easy method to exclude a nodemanager from the yarn cluster cleanly

Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      To take a nodemanager server offline, we add it to the file referenced by nodes.exclude-path
      and run the "rmadmin -refreshNodes" command to decommission it.
      But this method does not remove the node cleanly: the nodemanager servers remain listed under Decommissioned Nodes, as the attachment shows.
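
      The workflow above can be sketched as a small helper. This is only an illustration: the function names, the example file name, and the host names are hypothetical, and it assumes the exclude file is a plain list with one host per line. The only real command it refers to is "yarn rmadmin -refreshNodes", which is executed only when run=True.

```python
import subprocess
from pathlib import Path
from typing import List


def add_to_exclude_file(exclude_file: Path, hostname: str) -> bool:
    """Append hostname to the exclude file unless it is already listed.

    Returns True if the file was changed. Assumes one host per line.
    """
    text = exclude_file.read_text() if exclude_file.exists() else ""
    if hostname in {line.strip() for line in text.splitlines()}:
        return False
    # Make sure the new entry starts on its own line.
    separator = "" if text == "" or text.endswith("\n") else "\n"
    exclude_file.write_text(text + separator + hostname + "\n")
    return True


def decommission(exclude_file: Path, hostname: str, run: bool = False) -> List[str]:
    """List the host in the exclude file, then (optionally) refresh the RM."""
    add_to_exclude_file(exclude_file, hostname)
    cmd = ["yarn", "rmadmin", "-refreshNodes"]
    if run:
        # Requires a host with the yarn CLI and RM admin access.
        subprocess.run(cmd, check=True)
    return cmd
```

      After these two steps the node is decommissioned, but, as described above, it still appears under Decommissioned Nodes.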

       

      YARN-4311 introduced a removalTimer to clean up untracked nodes.
      But the logic of the isUntrackedNode method is too restrictive. If include-path is not used, no server can meet the criteria. Using an include file would introduce a potential risk in maintenance.
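
      To make the restriction concrete, here is a simplified Python model of the condition as this description reads it: a node counts as untracked only when an include list is in use and the node appears in neither the include list nor the exclude list. The function below is a stand-in, not the ResourceManager code, and the real check may differ in detail (for instance in how host names and IP addresses are matched).

```python
from typing import Set


def is_untracked_node(host: str, include_hosts: Set[str], exclude_hosts: Set[str]) -> bool:
    """Simplified model: untracked = include list in use AND host in neither list."""
    return (
        bool(include_hosts)              # an empty include list disables the check
        and host not in include_hosts
        and host not in exclude_hosts
    )


# No include file configured: nothing can ever be untracked.
print(is_untracked_node("nm2.example.com", set(), set()))                 # False
print(is_untracked_node("nm2.example.com", set(), {"nm2.example.com"}))   # False

# Include file in use and the host dropped from both lists: untracked.
print(is_untracked_node("nm2.example.com", {"nm1.example.com"}, set()))   # True
```

      Under this model, a cluster that relies only on the exclude file never produces an untracked node, so the removalTimer has nothing to clean up.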

      If a YARN cluster is deployed in the cloud, nodemanager servers are created and deleted frequently. We need a way to exclude a nodemanager from the YARN cluster cleanly. Otherwise, the map returned by rmContext.getInactiveRMNodes() keeps growing, which could eventually cause memory problems in the RM.
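
      The growth can be illustrated with a toy simulation. It assumes every replaced cloud node arrives with a fresh host name and uses a plain dictionary as a stand-in for the inactive-node map; the function name, the host-name pattern, and the timing values are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Optional


def simulate_inactive_nodes(churned_nodes: int, interval_s: int,
                            removal_timeout_s: Optional[int] = None) -> int:
    """Return the size of a stand-in inactive-node map after node churn.

    One node is decommissioned every interval_s seconds. If removal_timeout_s
    is given, entries at least that old are evicted; otherwise nothing is
    ever removed.
    """
    inactive: Dict[str, int] = {}  # host name -> time of decommission (seconds)
    for i in range(churned_nodes):
        now = i * interval_s
        inactive[f"nm-{i}.example.com"] = now  # fresh host name per cloud node
        if removal_timeout_s is not None:
            expired = [h for h, t in inactive.items() if now - t >= removal_timeout_s]
            for h in expired:
                del inactive[h]
    return len(inactive)


print(simulate_inactive_nodes(1000, 60))        # no cleanup: 1000 entries
print(simulate_inactive_nodes(1000, 60, 600))   # with eviction: 10 entries
```

      Without any eviction the map size equals the total number of nodes ever decommissioned, whereas with a timeout it is bounded by the churn rate times the timeout.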

      Attachments

        1. decommission nodes.png
          14 kB
          Zac Zhou

          People

            Assignee: Unassigned
            Reporter: Zac Zhou (yuan_zac)
            Votes: 0
            Watchers: 3
