Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
3.4.0
None
Description
Currently, inactive nodes that have been decommissioned, shut down, or lost for longer than an expiration time (set via yarn.resourcemanager.node-removal-untracked.timeout-ms, 60 seconds by default) and that appear in neither the include file nor the exclude file can be marked as untracked nodes and removed from RM state (YARN-4311). This is very useful when auto-scaling is enabled in an elastic cloud environment, because it keeps the number of inactive nodes (mostly decommissioned nodes) from growing without bound.
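To make the rule above concrete, here is a minimal Java sketch of a removal sweep. It is a simplification for illustration, not the ResourceManager's actual code: the class `UntrackedNodeSweep`, the method `sweep`, and the map of inactive-since timestamps are all hypothetical names, and the include and exclude files are modeled as plain sets of host names.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; these are not real YARN classes.
final class UntrackedNodeSweep {
    // Default from the description above: 60 seconds.
    static final long DEFAULT_TIMEOUT_MS = 60_000L;

    /**
     * Removes every inactive host that is in neither host list and has been
     * inactive for longer than timeoutMs. Returns the hosts that were removed.
     * The inactiveSince map must be mutable; it is modified in place.
     */
    static List<String> sweep(Map<String, Long> inactiveSince,
                              Set<String> includes,
                              Set<String> excludes,
                              long nowMs,
                              long timeoutMs) {
        List<String> removed = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = inactiveSince.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            String host = entry.getKey();
            // Untracked: listed in neither the include nor the exclude file.
            boolean untracked = !includes.contains(host) && !excludes.contains(host);
            // Expired: inactive for longer than the configured timeout.
            boolean expired = nowMs - entry.getValue() > timeoutMs;
            if (untracked && expired) {
                it.remove();
                removed.add(host);
            }
        }
        return removed;
    }
}
```

In this sketch a host listed in either file is never removed, no matter how long it has been inactive, which is what keeps explicitly managed nodes visible in RM state.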
But this only works when the include path is configured, which does not match most of our cloud environments. They run without a configured whitelist of nodes, which makes it easy to control the auto-scaling of nodes when there are no further security requirements.
So I propose supporting marking inactive nodes as untracked without a configured include path. To stay compatible with earlier versions, we can add a config switch for this.
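A rough Java sketch of how such a switch could gate the check, under stated assumptions: `UntrackedNodePolicy`, `isUntracked`, and the flag `allowWithoutIncludePath` are hypothetical names chosen for illustration, not the names used in any patch, and the host lists are modeled as plain sets.

```java
import java.util.Set;

// Hypothetical illustration of the proposed switch; not real YARN code.
final class UntrackedNodePolicy {
    /**
     * Decides whether an inactive host may be treated as untracked.
     *
     * @param includePathConfigured   whether an include file is configured
     * @param allowWithoutIncludePath the proposed switch (assumed off by default)
     */
    static boolean isUntracked(String host,
                               Set<String> includes,
                               Set<String> excludes,
                               boolean includePathConfigured,
                               boolean allowWithoutIncludePath) {
        // Behavior of earlier versions: without an include path, nothing is untracked.
        if (!includePathConfigured && !allowWithoutIncludePath) {
            return false;
        }
        // Otherwise a host is untracked when it is in neither list.
        return !includes.contains(host) && !excludes.contains(host);
    }
}
```

With the switch off, the sketch behaves like the existing rule, so clusters that rely on the current behavior would be unaffected; with it on, a cluster without an include file can still age out hosts that are not in the exclude file.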
Any thoughts, suggestions, or feedback are welcome!