Details
Type: Bug
Status: Resolved
Priority: Critical
Resolution: Fixed
2.5.1
None
Description
The YARN service check fails after the Resource Manager is moved to a new host. The issue occurs only on wire-encrypted clusters. I performed the same operation on a non-wire-encrypted cluster, and the service check passed.
Steps to reproduce:
1) Move the Resource Manager to a new host.
2) Run the YARN service check.
3) The service check fails with the following exception:
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/service_check.py", line 181, in <module>
    ServiceCheck().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 329, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/service_check.py", line 131, in service_check
    active_rm_webapp_address = self.get_active_rm_webapp_address()
  File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/service_check.py", line 177, in get_active_rm_webapp_address
    raise Fail('Resource Manager state is not available. Failed to determine the active Resource Manager web application address from {0}'.format(','.join(rm_webapp_addresses)));
resource_management.core.exceptions.Fail: Resource Manager state is not available. Failed to determine the active Resource Manager web application address from ctr-e133-1493418528701-64577-01-000002.hwx.site:8090
Even after the Resource Manager has been moved to the new host, the service check still refers to the old Resource Manager host.
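The failure mode can be illustrated with a small sketch. This is not the actual Ambari service_check.py code; it only mirrors the shape suggested by the traceback (a list of candidate web application addresses and a Fail raised when none reports an active Resource Manager). The `probe` callable, the "ACTIVE" state string, the local `Fail` class, and the example host names are assumptions made for illustration.

```python
class Fail(Exception):
    """Local stand-in for resource_management.core.exceptions.Fail."""


def get_active_rm_webapp_address(rm_webapp_addresses, probe):
    """Return the first address whose probe reports an active Resource Manager.

    `probe` is a hypothetical callable mapping an address to a state string,
    or to None when the address cannot be reached.
    """
    for address in rm_webapp_addresses:
        if probe(address) == "ACTIVE":
            return address
    # No candidate answered as active: fail with the message seen in the report.
    raise Fail(
        "Resource Manager state is not available. Failed to determine the "
        "active Resource Manager web application address from {0}".format(
            ",".join(rm_webapp_addresses)
        )
    )


if __name__ == "__main__":
    # A stale address that no longer hosts a Resource Manager never reports ACTIVE.
    try:
        get_active_rm_webapp_address(["old-rm.example.com:8090"], lambda addr: None)
    except Fail as err:
        print(err)
```

If the only configured address is the old host, every probe fails and the check raises, which matches the behavior described above.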
Workaround: The property 'yarn.resourcemanager.webapp.https.address' was still referring to the old Resource Manager host. I manually updated this property to the new host and restarted YARN. After this, the service check passes.
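To confirm which properties still point at the old host after the move, a check along these lines may help. It is a minimal sketch that assumes the yarn-site properties have already been loaded into a plain dict; `find_stale_properties` and both host names are hypothetical, and no Ambari API is used.

```python
def find_stale_properties(config, old_host):
    """Return the sorted names of properties whose value still mentions old_host."""
    return sorted(
        name for name, value in config.items() if old_host in str(value)
    )


if __name__ == "__main__":
    # Hypothetical yarn-site snapshot taken after the Resource Manager move.
    yarn_site = {
        "yarn.resourcemanager.webapp.address": "new-rm.example.com:8088",
        "yarn.resourcemanager.webapp.https.address": "old-rm.example.com:8090",
    }
    for name in find_stale_properties(yarn_site, "old-rm.example.com"):
        print("still refers to old host:", name)
```

Any property reported this way would need the same manual update described in the workaround.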