Hadoop YARN / YARN-6072

RM unable to start in secure mode


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.8.0, 3.0.0-alpha2
    • Fix Version/s: 2.8.0, 2.9.0, 3.0.0-alpha2
    • Component/s: resourcemanager
    • Labels: None

    Description

      Resource manager is unable to start in secure mode

      2017-01-08 14:27:29,917 INFO org.apache.hadoop.conf.Configuration: found resource hadoop-policy.xml at file:/opt/hadoop/release/hadoop-3.0.0-alpha2-SNAPSHOT/etc/hadoop/hadoop-policy.xml
      2017-01-08 14:27:29,918 INFO org.apache.hadoop.yarn.server.resourcemanager.AdminService: Refresh All
      java.lang.NullPointerException
              at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshServiceAcls(AdminService.java:569)
              at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshServiceAcls(AdminService.java:552)
              at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:707)
              at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:302)
              at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:142)
              at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:888)
              at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
              at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
              at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
      2017-01-08 14:27:29,919 ERROR org.apache.hadoop.yarn.server.resourcemanager.AdminService: RefreshAll failed so firing fatal event
      org.apache.hadoop.ha.ServiceFailedException
              at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:712)
              at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:302)
              at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:142)
              at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:888)
              at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
              at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
              at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
      2017-01-08 14:27:29,920 INFO org.apache.hadoop.ipc.Server: Starting Socket Reader #1 for port 8033
      2017-01-08 14:27:29,948 WARN org.apache.hadoop.ha.ActiveStandbyElector: Exception handling the winning of election
      org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
              at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:144)
              at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:888)
              at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
              at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
              at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
      Caused by: org.apache.hadoop.ha.ServiceFailedException: Error on refreshAll during transition to Active
              at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:311)
              at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:142)
              ... 4 more
      Caused by: org.apache.hadoop.ha.ServiceFailedException
              at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:712)
              at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:302)
              ... 5 more
      
      

      ResourceManager services are added in the following order:

      1. EmbeddedElector
      2. AdminService

      During ResourceManager service start(), EmbeddedElector starts first and invokes AdminService#refreshAll(), but AdminService#serviceStart() runs only after the ActiveStandbyElectorBasedElectorService start has completed. AdminService#server is therefore still null at that point, so AdminService#refreshAll() fails with a NullPointerException in refreshServiceAcls():

            if (getConfig().getBoolean(
                CommonConfigurationKeysPublic.HADOOP_SECURITY_AUTHORIZATION,
                false)) {
              refreshServiceAcls();
            }
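
The ordering problem described above can be reproduced in isolation. The following is a minimal sketch, not Hadoop code: `SketchService`, `SketchAdminService`, `SketchElector`, and `StartOrderSketch` are hypothetical stand-ins, and the election is modeled as happening synchronously inside the elector's `start()`, whereas the real elector is driven by a ZooKeeper callback thread. It only illustrates why a field assigned in one service's start is still null when an earlier-started service calls into it. Starting the admin service first is shown as one ordering that avoids the null dereference; it is not necessarily what the attached patches do.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a service with a start hook; not the Hadoop API. */
interface SketchService {
    void start();
}

/** Stand-in for AdminService: `server` is assigned only when start() runs. */
class SketchAdminService implements SketchService {
    private Object server; // null until start() has run

    @Override
    public void start() {
        server = new Object();
    }

    /** Dereferences `server` without a null check, like the failing call. */
    void refreshServiceAcls() {
        server.hashCode(); // NullPointerException if start() has not run yet
    }
}

/** Stand-in for the embedded elector: becomes active during its own start(). */
class SketchElector implements SketchService {
    private final SketchAdminService admin;
    boolean becameActive = false;

    SketchElector(SketchAdminService admin) {
        this.admin = admin;
    }

    @Override
    public void start() {
        try {
            admin.refreshServiceAcls(); // part of the transition to active
            becameActive = true;
        } catch (NullPointerException e) {
            becameActive = false;       // transition to active failed
        }
    }
}

class StartOrderSketch {
    /** Starts both services in list order and reports whether the elector became active. */
    static boolean run(boolean electorFirst) {
        SketchAdminService admin = new SketchAdminService();
        SketchElector elector = new SketchElector(admin);

        List<SketchService> services = new ArrayList<>();
        if (electorFirst) {
            services.add(elector);
            services.add(admin);
        } else {
            services.add(admin);
            services.add(elector);
        }
        for (SketchService s : services) {
            s.start();
        }
        return elector.becameActive;
    }

    public static void main(String[] args) {
        System.out.println("elector first: active=" + run(true));
        System.out.println("admin first: active=" + run(false));
    }
}
```

With the elector added first, the sketch reports that the transition failed; with the admin service added first, it succeeds.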
      

      Attachments

        1. YARN-6072.03.patch
          3 kB
          Ajith S
        2. YARN-6072.03.branch-2.8.patch
          3 kB
          Ajith S
        3. YARN-6072.02.patch
          3 kB
          Ajith S
        4. YARN-6072.01.patch
          3 kB
          Ajith S
        5. YARN-6072.01.branch-2.patch
          3 kB
          Ajith S
        6. YARN-6072.01.branch-2.8.patch
          3 kB
          Ajith S
        7. hadoop-secureuser-resourcemanager-vm1.log
          564 kB
          Bibin Chundatt

            People

              Assignee: ajithshetty Ajith S
              Reporter: bibinchundatt Bibin Chundatt