Details
Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Description
Currently, in LocalityMulticastAMRMProxyPolicy, whenever we cannot resolve a resource name (node or rack) to a sub-cluster, we always route the request to the home sub-cluster. However, the home sub-cluster might not always be ready to use (it may have timed out, see YARN-8581) or enabled (per the AMRMProxyPolicy weights). It might also be overwhelmed by requests if the sub-cluster resolver has an issue. In this Jira, we change the policy to pick a random active and enabled sub-cluster for each resource request we cannot resolve.
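
The fallback selection could look roughly like the sketch below. This is not the actual patch: the class name `RandomSubClusterFallback`, the method `pickFallback`, and the use of plain strings for sub-cluster ids are hypothetical, and treating "enabled" as "has a policy weight greater than zero" is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.Set;

/** Hypothetical helper; not part of the Hadoop code base. */
final class RandomSubClusterFallback {

    private RandomSubClusterFallback() {
    }

    /**
     * Picks a random sub-cluster that is both active and enabled.
     *
     * @param active  ids of sub-clusters currently considered active
     * @param weights policy weight per sub-cluster id (assumption: weight > 0 means enabled)
     * @param rng     source of randomness, injected so callers can seed it
     * @return the chosen sub-cluster id, or null if no candidate qualifies
     */
    static String pickFallback(Set<String> active, Map<String, Float> weights, Random rng) {
        List<String> candidates = new ArrayList<>();
        for (String id : active) {
            Float weight = weights.get(id);
            // Keep only sub-clusters that are active AND have a positive weight.
            if (weight != null && weight > 0f) {
                candidates.add(id);
            }
        }
        if (candidates.isEmpty()) {
            // Nothing usable; the caller decides what to do (e.g. fall back to home).
            return null;
        }
        // Sort so that a seeded Random gives a reproducible choice.
        Collections.sort(candidates);
        return candidates.get(rng.nextInt(candidates.size()));
    }
}
```

A caller would invoke `pickFallback` only for requests whose node or rack name the resolver could not map to a sub-cluster. Requests that resolve normally keep their locality-based routing.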