AMBARI-21204

YARN stopped by itself after start (HA run)


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.5.1
    • Fix Version/s: 2.5.2
    • Component/s: None
    • Labels: None

    Description

      From RM logs :

      2017-06-07 14:23:19,191 FATAL resourcemanager.ResourceManager (ResourceManager.java:main(1240)) - Error starting ResourceManager
      org.apache.hadoop.service.ServiceStateException: java.io.IOException: Couldn't set ACLs on parent ZNode: /yarn-leader-election
              at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
              at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
              at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
              at org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceInit(AdminService.java:152)
              at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
              at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
              at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:281)
              at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
              at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1236)
      Caused by: java.io.IOException: Couldn't set ACLs on parent ZNode: /yarn-leader-election
              at org.apache.hadoop.ha.ActiveStandbyElector.ensureParentZNode(ActiveStandbyElector.java:351)
              at org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.serviceInit(EmbeddedElectorService.java:103)
              at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
              ... 7 more
      Caused by: org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /yarn-leader-election
      

      The problem is that disabling security changes the ZooKeeper ACL on the ResourceManager's znode, a step introduced in AMBARI-19331. Since the recent change in HDFS-11403, the RM checks the znode version and fails if it differs from the expected one.
      The correct fix would be to remove the znode when security is disabled, instead of breaking the election znode's consistency by manually changing its ACL to all. The RM should then create the znode itself with the proper ACL.
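
The failure mode and the proposed fix can be illustrated with a toy model. This is a minimal sketch, not the ZooKeeper client API or the actual RM code: `ToyZNodeStore`, `ensure_parent_znode`, the ACL strings, and the way the expected version is chosen are all hypothetical stand-ins. It assumes only what the description states, namely that a version-checked ACL update fails once something else has modified the znode.

```python
PATH = "/yarn-leader-election"


class BadVersionError(Exception):
    """Stands in for ZooKeeper's BadVersion error in this toy model."""


class ToyZNodeStore:
    """In-memory stand-in for a znode tree (hypothetical, not a real client)."""

    def __init__(self):
        self._nodes = {}  # path -> {"acl": str, "version": int}

    def create(self, path, acl):
        self._nodes[path] = {"acl": acl, "version": 0}

    def exists(self, path):
        return path in self._nodes

    def delete(self, path):
        self._nodes.pop(path, None)

    def get_acl(self, path):
        return self._nodes[path]["acl"]

    def set_acl(self, path, acl, expected_version):
        node = self._nodes[path]
        # Version-checked update: reject if the znode changed underneath us.
        if expected_version != node["version"]:
            raise BadVersionError(path)
        node["acl"] = acl
        node["version"] += 1


def ensure_parent_znode(store, path, acl, expected_version=0):
    """Hypothetical RM startup step: create the znode or reset its ACL."""
    if not store.exists(path):
        store.create(path, acl)
    else:
        store.set_acl(path, acl, expected_version)


def start_after_manual_acl_change():
    store = ToyZNodeStore()
    store.create(PATH, "secure-acl")
    # Security disabling rewrites the ACL by hand; the version becomes 1.
    store.set_acl(PATH, "open-acl", expected_version=0)
    try:
        # The RM still expects version 0 (an assumption of this sketch).
        ensure_parent_znode(store, PATH, "rm-acl", expected_version=0)
    except BadVersionError:
        return "BadVersion"
    return "started"


def start_after_znode_removal():
    store = ToyZNodeStore()
    store.create(PATH, "secure-acl")
    # Proposed fix: remove the znode instead of editing its ACL.
    store.delete(PATH)
    ensure_parent_znode(store, PATH, "rm-acl", expected_version=0)
    return "started", store.get_acl(PATH)


if __name__ == "__main__":
    print(start_after_manual_acl_change())
    print(start_after_znode_removal())
```

In the first scenario the manual ACL change leaves the znode in a state the RM does not expect, so startup fails. In the second, the RM finds no znode and creates it from scratch with its own ACL.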

      Attachments

        1. AMBARI-21204_3.patch
          10 kB
          Dmytro Sen

            People

              Assignee: Dmytro Sen (dsen)
              Reporter: Dmytro Sen (dsen)