  1. Hadoop YARN
  2. YARN-914 (Umbrella) Support graceful decommission of nodemanager
  3. YARN-5566

Client-side NM graceful decom is not triggered when jobs finish


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.8.0
    • Fix Version/s: 2.8.0, 3.0.0-alpha2
    • Component/s: nodemanager
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      I was testing the client-side NM graceful decommission and noticed that it was always waiting for the timeout, even if all jobs running on that node (or even the cluster) had already finished.

      For example:

      1. JobA is running with at least one container on NodeA
      2. User runs client-side decom on NodeA at 5:00am with a timeout of 3 hours --> NodeA enters DECOMMISSIONING state
      3. JobA finishes at 6:00am and there are no other jobs running on NodeA
      4. User's client reaches the timeout at 8:00am, and forcibly decommissions NodeA

      NodeA should have decommissioned at 6:00am.
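
      The expected behavior in the example above can be sketched as a small simulation. This is only an illustration of the reasoning, not the code from the attached patches: the class GracefulDecomSketch, the method waitForDecommission, and the per-minute polling granularity are all hypothetical, and no real YARN API is used. The wait ends as soon as the node reports no running containers, and the timeout is only a fallback.

```java
import java.util.function.IntUnaryOperator;

public class GracefulDecomSketch {
    /** Hypothetical outcome of waiting on a DECOMMISSIONING node. */
    enum Outcome { DRAINED_EARLY, TIMED_OUT }

    /** Hypothetical result holder: how the wait ended and after how many minutes. */
    static final class Result {
        final Outcome outcome;
        final int elapsedMinutes;

        Result(Outcome outcome, int elapsedMinutes) {
            this.outcome = outcome;
            this.elapsedMinutes = elapsedMinutes;
        }
    }

    /**
     * Polls once per simulated minute. Returns as soon as the node has no
     * running containers; falls back to the timeout only if it never drains.
     * runningContainersAtMinute is an assumed stand-in for a node status query.
     */
    static Result waitForDecommission(IntUnaryOperator runningContainersAtMinute,
                                      int timeoutMinutes) {
        for (int minute = 0; minute < timeoutMinutes; minute++) {
            if (runningContainersAtMinute.applyAsInt(minute) == 0) {
                return new Result(Outcome.DRAINED_EARLY, minute);
            }
        }
        return new Result(Outcome.TIMED_OUT, timeoutMinutes);
    }

    public static void main(String[] args) {
        // NodeA scenario: decommission requested at 5:00am (minute 0),
        // JobA finishes at 6:00am (minute 60), timeout is 3 hours (180 minutes).
        Result r = waitForDecommission(minute -> minute < 60 ? 1 : 0, 180);
        System.out.println(r.outcome + " after " + r.elapsedMinutes + " minutes");
        // prints "DRAINED_EARLY after 60 minutes"
    }
}
```

      The behavior reported in this issue corresponds to the TIMED_OUT branch being taken at minute 180 even though the node was idle from minute 60 onward.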

      Attachments

        1. YARN-5566.001.patch
          1 kB
          Robert Kanter
        2. YARN-5566.002.patch
          2 kB
          Robert Kanter
        3. YARN-5566.003.patch
          8 kB
          Robert Kanter
        4. YARN-5566.004.patch
          9 kB
          Robert Kanter
        5. YARN-5566.004.branch-2.8.patch
          14 kB
          Robert Kanter
        6. YARN-5566.004.branch-2.8.addendum.patch
          3 kB
          Robert Kanter


          People

            Assignee: Robert Kanter (rkanter)
            Reporter: Robert Kanter (rkanter)
            Votes: 0
            Watchers: 7

