[YARN-5566] Client-side NM graceful decom is not triggered when jobs finish - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 2.8.0
Fix Version/s: 2.8.0, 3.0.0-alpha2
Component/s: nodemanager
Labels: None
Target Version/s: 2.8.0
Hadoop Flags: Reviewed

Description

I was testing the client-side NM graceful decommission and noticed that it was always waiting for the timeout, even if all jobs running on that node (or even the cluster) had already finished.

For example:

- JobA is running with at least one container on NodeA
- User runs client-side decom on NodeA at 5:00am with a timeout of 3 hours --> NodeA enters DECOMMISSIONING state
- JobA finishes at 6:00am and there are no other jobs running on NodeA
- User's client reaches the timeout at 8:00am, and forcibly decommissions NodeA

NodeA should have decommissioned at 6:00am.
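
To make the expected timing concrete, here is a small sketch of the rule the example implies: a DECOMMISSIONING node should finish decommissioning as soon as its last container completes, or at the timeout, whichever comes first. This is only an illustration of the desired behavior, not the actual NodeManager or ResourceManager code; the function name `expected_decommission_time` and the calendar date used are hypothetical.

```python
from datetime import datetime, timedelta


def expected_decommission_time(decom_start, timeout, container_finish_times):
    """Return when a DECOMMISSIONING node should become DECOMMISSIONED.

    Hypothetical helper for illustration only; it models the behavior
    requested in this issue, not any real YARN API.
    """
    deadline = decom_start + timeout
    # Only containers still running when decommissioning starts matter.
    still_running = [t for t in container_finish_times if t > decom_start]
    if not still_running:
        # Nothing left to wait for: decommission immediately.
        return decom_start
    drained_at = max(still_running)
    # Decommission when the node drains, but never later than the timeout.
    return min(drained_at, deadline)


# The scenario from the description (the date itself is arbitrary).
start = datetime(2016, 8, 26, 5, 0)      # decom requested at 5:00am
job_a_done = datetime(2016, 8, 26, 6, 0)  # JobA finishes at 6:00am
result = expected_decommission_time(start, timedelta(hours=3), [job_a_done])
print(result.strftime("%H:%M"))
```

With the inputs from the description, the node drains at 6:00am, well before the 8:00am deadline, so that is when it should be decommissioned.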

Attachments

- YARN-5566.004.patch, 30/Aug/16 19:14, 9 kB, Robert Kanter
- YARN-5566.004.branch-2.8.patch, 06/Sep/16 20:33, 14 kB, Robert Kanter
- YARN-5566.004.branch-2.8.addendum.patch, 08/Sep/16 22:31, 3 kB, Robert Kanter
- YARN-5566.003.patch, 28/Aug/16 22:56, 8 kB, Robert Kanter
- YARN-5566.002.patch, 28/Aug/16 18:32, 2 kB, Robert Kanter
- YARN-5566.001.patch, 26/Aug/16 08:08, 1 kB, Robert Kanter

Issue Links

breaks: YARN-5655 TestContainerManagerSecurity#testNMTokens is asserting (Resolved)

People

Assignee: Robert Kanter

Reporter: Robert Kanter

Votes: 0

Watchers: 7

Dates

Created: 26/Aug/16 07:55

Updated: 25/Oct/19 20:26

Resolved: 09/Sep/16 04:22