  1. Hadoop YARN
  2. YARN-3804

Both RMs are in standby state when the Kerberos user is not in yarn.admin.acl

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.7.0
    • Fix Version/s: 2.8.0, 2.7.1, 3.0.0-alpha1
    • Component/s: resourcemanager
    • Labels:
      None
    • Environment:

      Suse 11 Sp3, 2 RM, Secure

      Description

      Steps to reproduce
      ================
      1. Configure the cluster in secure mode
      2. On the RMs, configure yarn.admin.acl=dsperf
      3. Configure yarn.resourcemanager.principal=yarn
      4. Start both RMs

      Both RMs will remain in Standby forever

      2015-06-15 12:20:21,556 WARN org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=yarn     OPERATION=refreshAdminAcls      TARGET=AdminService     RESULT=FAILURE  DESCRIPTION=Unauthorized userPERMISSIONS=
      2015-06-15 12:20:21,556 WARN org.apache.hadoop.ha.ActiveStandbyElector: Exception handling the winning of election
      org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
              at org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.becomeActive(EmbeddedElectorService.java:128)
              at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:824)
              at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:420)
              at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:645)
              at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:518)
      Caused by: org.apache.hadoop.ha.ServiceFailedException: Can not execute refreshAdminAcls
              at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:297)
              at org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.becomeActive(EmbeddedElectorService.java:126)
              ... 4 more
      Caused by: org.apache.hadoop.yarn.exceptions.YarnException: org.apache.hadoop.security.AccessControlException: User yarn doesn't have permission to call 'refreshAdminAcls'
              at org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:38)
              at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkAcls(AdminService.java:230)
              at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAdminAcls(AdminService.java:465)
              at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:295)
              ... 5 more
      Caused by: org.apache.hadoop.security.AccessControlException: User yarn doesn't have permission to call 'refreshAdminAcls'
              at org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.verifyAdminAccess(RMServerUtils.java:182)
              at org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.verifyAdminAccess(RMServerUtils.java:148)
              at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkAccess(AdminService.java:223)
              at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkAcls(AdminService.java:228)
              ... 7 more
      

      Analysis

      On each attempt by an RM to switch to Active, refreshAdminAcls is called, and the RM user (yarn) does not have the ACL permission to call it.
      The switch to Active is then retried indefinitely, and false is always returned from
      ActiveStandbyElector#becomeActive()
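
      The loop described in the analysis can be modelled with a small standalone sketch. This is not Hadoop code: the class and method names are hypothetical, the ACL is simplified to a comma-separated user list (group entries are ignored), and the retry count is bounded only so that the example terminates.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Simplified model of the failure described in this issue; not actual Hadoop code.
public class StandbyLoopSketch {

    // Hypothetical stand-in for the admin ACL check: only users named in the
    // comma-separated ACL value (or "*") may call refreshAdminAcls.
    static boolean isAdmin(String adminAcl, String user) {
        Set<String> admins =
            new HashSet<>(Arrays.asList(adminAcl.trim().split("\\s*,\\s*")));
        return admins.contains("*") || admins.contains(user);
    }

    // Models one attempt to become active: the transition fails when the
    // daemon user is not allowed to refresh the admin ACLs.
    static boolean becomeActive(String adminAcl, String daemonUser) {
        return isAdmin(adminAcl, daemonUser);
    }

    // Retries the transition up to maxAttempts times and returns the number
    // of failed attempts. Neither input changes between attempts, so a
    // failing configuration fails on every attempt.
    static int countFailedAttempts(String adminAcl, String daemonUser, int maxAttempts) {
        int failures = 0;
        for (int i = 0; i < maxAttempts; i++) {
            if (becomeActive(adminAcl, daemonUser)) {
                break;
            }
            failures++;
        }
        return failures;
    }

    public static void main(String[] args) {
        // Values from the reproduction steps: the ACL lists dsperf, the RM runs as yarn.
        System.out.println(countFailedAttempts("dsperf", "yarn", 5));
        // With the daemon user in the ACL, the first attempt succeeds.
        System.out.println(countFailedAttempts("dsperf,yarn", "yarn", 5));
    }
}
```

      In this model every attempt fails for the reported configuration, whatever the retry limit, which is why retrying alone cannot bring either RM out of Standby.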

      Expected

      The RM should receive a shutdown event after a few retries, or even on the first attempt,
      since the user under which it retries refreshAdminAcls can never change at runtime.
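
      One way to picture the expected behaviour is to separate failures that can never succeed on retry from those that might. The sketch below is only an illustration under that assumption; the types and names are hypothetical, and it does not claim to reflect how the attached patches resolve the issue.

```java
// Illustrative fail-fast policy; all names here are hypothetical.
public class FailFastSketch {

    // Possible results of one attempt to transition to active.
    enum Outcome { ACTIVE, RETRY, SHUTDOWN }

    // Stand-in for an access-control failure, which cannot change at runtime.
    static class AccessDeniedException extends Exception {
        AccessDeniedException(String message) {
            super(message);
        }
    }

    // A transition attempt that may fail with any exception.
    interface Transition {
        void run() throws Exception;
    }

    // Classifies one attempt: permission failures request shutdown at once,
    // any other failure is treated as worth retrying.
    static Outcome attempt(Transition transition) {
        try {
            transition.run();
            return Outcome.ACTIVE;
        } catch (AccessDeniedException e) {
            return Outcome.SHUTDOWN;
        } catch (Exception e) {
            return Outcome.RETRY;
        }
    }

    public static void main(String[] args) {
        // The daemon user lacks admin permission: stop instead of retrying.
        System.out.println(attempt(() -> {
            throw new AccessDeniedException(
                "User yarn doesn't have permission to call 'refreshAdminAcls'");
        }));
        // A failure that might be transient stays eligible for retry.
        System.out.println(attempt(() -> {
            throw new IllegalStateException("temporary failure");
        }));
    }
}
```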

      States from commands

      ./yarn rmadmin -getServiceState rm2
      standby
      ./yarn rmadmin -getServiceState rm1
      standby

      ./yarn rmadmin -checkHealth rm1
      echo $?
      0
      ./yarn rmadmin -checkHealth rm2
      echo $?
      0

        Attachments

        1. YARN-3804.01.patch
          2 kB
          Varun Saxena
        2. YARN-3804.02.patch
          2 kB
          Varun Saxena
        3. YARN-3804.03.patch
          3 kB
          Varun Saxena
        4. YARN-3804.04.patch
          6 kB
          Varun Saxena
        5. YARN-3804.05.patch
          7 kB
          Varun Saxena
        6. YARN-3804.branch-2.7.patch
          6 kB
          Xuan Gong

            People

            • Assignee:
              varun_saxena Varun Saxena
              Reporter:
              bibinchundatt Bibin A Chundatt
            • Votes:
              0
              Watchers:
              12
