Details
Bug
Status: Resolved
Blocker
Resolution: Fixed
2.8.0, 3.0.0-alpha2
None
Reviewed
Description
Resource manager is unable to start in secure mode
2017-01-08 14:27:29,917 INFO org.apache.hadoop.conf.Configuration: found resource hadoop-policy.xml at file:/opt/hadoop/release/hadoop-3.0.0-alpha2-SNAPSHOT/etc/hadoop/hadoop-policy.xml
2017-01-08 14:27:29,918 INFO org.apache.hadoop.yarn.server.resourcemanager.AdminService: Refresh All
java.lang.NullPointerException
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshServiceAcls(AdminService.java:569)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshServiceAcls(AdminService.java:552)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:707)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:302)
at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:142)
at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:888)
at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
2017-01-08 14:27:29,919 ERROR org.apache.hadoop.yarn.server.resourcemanager.AdminService: RefreshAll failed so firing fatal event
org.apache.hadoop.ha.ServiceFailedException
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:712)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:302)
at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:142)
at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:888)
at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
2017-01-08 14:27:29,920 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 8033
2017-01-08 14:27:29,948 WARN org.apache.hadoop.ha.ActiveStandbyElector: Exception handling the winning of election
org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:144)
at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:888)
at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
Caused by: org.apache.hadoop.ha.ServiceFailedException: Error on refreshAll during transition to Active
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:311)
at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:142)
... 4 more
Caused by: org.apache.hadoop.ha.ServiceFailedException
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:712)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:302)
... 5 more
ResourceManager services are added in the following order:
- EmbeddedElector
- AdminService
During ResourceManager service start(), EmbeddedElector starts first and invokes AdminService#refreshAll(), but AdminService#serviceStart() runs only after the ActiveStandbyElectorBasedElectorService start is complete. AdminService#server is therefore still null, which causes AdminService#refreshAll() to fail at this check:
if (getConfig().getBoolean(
    CommonConfigurationKeysPublic.HADOOP_SECURITY_AUTHORIZATION, false)) {
  refreshServiceAcls();
}
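The start-order dependency described above can be reproduced in isolation. The following is a minimal sketch, not the actual ResourceManager code: `Service`, `AdminSketch`, `ElectorSketch`, `startAll` and `run` are hypothetical stand-ins invented for illustration, and the elector callback is made synchronous for simplicity (in the real trace it arrives on the ZooKeeper event thread). It only shows why a refresh triggered before the admin service has started hits a null `server` when authorization is enabled.

```java
import java.util.ArrayList;
import java.util.List;

public class StartOrderSketch {

    /** Minimal stand-in for a startable service; NOT the real Hadoop Service API. */
    interface Service {
        void start();
    }

    /** Hypothetical stand-in for AdminService: 'server' is created only in start(). */
    static class AdminSketch implements Service {
        private Object server; // stays null until start() has run

        @Override
        public void start() {
            server = new Object();
        }

        /** Touches 'server' only when authorization is enabled, like the check above. */
        void refreshAll(boolean authorizationEnabled) {
            if (authorizationEnabled) {
                server.hashCode(); // throws NullPointerException if start() has not run
            }
        }
    }

    /** Hypothetical stand-in for the embedded elector: calls refreshAll() when it starts. */
    static class ElectorSketch implements Service {
        private final AdminSketch admin;
        private final boolean authorizationEnabled;

        ElectorSketch(AdminSketch admin, boolean authorizationEnabled) {
            this.admin = admin;
            this.authorizationEnabled = authorizationEnabled;
        }

        @Override
        public void start() {
            admin.refreshAll(authorizationEnabled);
        }
    }

    /** Starts services in list order; returns false if a start hits the null server. */
    static boolean startAll(List<Service> services) {
        try {
            for (Service s : services) {
                s.start();
            }
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    /** Builds both services in the requested order and reports whether startup succeeded. */
    public static boolean run(boolean electorFirst, boolean authorizationEnabled) {
        AdminSketch admin = new AdminSketch();
        ElectorSketch elector = new ElectorSketch(admin, authorizationEnabled);
        List<Service> services = new ArrayList<>();
        if (electorFirst) {
            services.add(elector);
            services.add(admin);
        } else {
            services.add(admin);
            services.add(elector);
        }
        return startAll(services);
    }

    public static void main(String[] args) {
        System.out.println("elector first, secure:   " + run(true, true));
        System.out.println("admin first, secure:     " + run(false, true));
        System.out.println("elector first, insecure: " + run(true, false));
    }
}
```

In this sketch only the combination "elector first" plus "authorization enabled" fails, which matches the report that the problem appears in secure mode. Reordering the services here is meant to illustrate the dependency and is not necessarily the change made in the fix.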