Details
Type: Improvement
Status: Open
Priority: Minor
Resolution: Unresolved
Description
We have noticed that an initialized YarnClient does not recover when DNS or /etc/hosts is modified after startup.
The following reproducible sequence of events can occur in any service that instantiates and uses YarnClient:
- YARN has RM HA enabled (yarn.resourcemanager.ha.enabled is true) and there are two RMs (rm1 and rm2).
- yarn.client.failover-proxy-provider is set to org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider.
1) rm2 is currently the active RM.
2) /etc/hosts (or DNS) is missing host information for rm2.
3) A service is started and initializes the YarnClient at startup.
4) At some point after YarnClient has finished initializing, /etc/hosts is updated and now contains host information for rm2.
5) YARN is queried, for instance by calling yarnclient.getApplications().
6) All YarnClient attempts to communicate with rm2 fail with UnknownHostException, even though /etc/hosts now contains host information for it.
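
This report does not identify the root cause. One standard JDK behavior that could produce this symptom is that a java.net.InetSocketAddress performs its hostname lookup once, at construction, and an address that failed to resolve stays unresolved for its lifetime. The sketch below illustrates that behavior only; it uses no Hadoop classes. The class name StaleAddressSketch and the helper refreshIfUnresolved are hypothetical, and "localhost" and port 8032 are placeholder values standing in for rm2.

```java
import java.net.InetSocketAddress;

public class StaleAddressSketch {

    // Hypothetical helper (not a Hadoop API): if the cached address never
    // resolved, build a new InetSocketAddress so the JDK attempts the
    // hostname lookup again. A resolved address is returned unchanged.
    static InetSocketAddress refreshIfUnresolved(InetSocketAddress cached) {
        if (!cached.isUnresolved()) {
            return cached;
        }
        return new InetSocketAddress(cached.getHostString(), cached.getPort());
    }

    public static void main(String[] args) {
        // Simulates steps 2 and 3: the address object is created while the
        // host cannot be resolved. createUnresolved skips the lookup, which
        // mimics a lookup that failed at startup.
        InetSocketAddress cached =
                InetSocketAddress.createUnresolved("localhost", 8032);

        // Simulates steps 4 to 6: the host is resolvable now, but the cached
        // object is never re-resolved on its own.
        System.out.println("cached is unresolved: " + cached.isUnresolved());

        // Re-creating the address performs a new lookup.
        InetSocketAddress refreshed = refreshIfUnresolved(cached);
        System.out.println("refreshed is unresolved: " + refreshed.isUnresolved());
    }
}
```

If something like this is in play, a client that caches the address object created at initialization would keep failing until the address is rebuilt. Note that the JVM also caches failed lookups for a short period (the networkaddress.cache.negative.ttl security property), so a retry immediately after the hosts file changes may still fail.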