Hadoop YARN / YARN-4663

Deadlocks in ZKRMStateStore

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Invalid
    • Affects Version/s: 2.7.0
    • Fix Version/s: None
    • Component/s: resourcemanager
    • Labels: None
    • Target Version/s:

      Description

      Java stack information for the threads listed above:
      ===================================================
      "org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$VerifyActiveStatusThread":
      	at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.doStoreMultiWithRetries(ZKRMStateStore.java:963)
      	- waiting to lock <0x00000000c8470590> (a org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore)
      	at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.access$600(ZKRMStateStore.java:92)
      	at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$VerifyActiveStatusThread.run(ZKRMStateStore.java:1113)
      "main-EventThread":
      	at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1468)
      	- waiting to lock <0x00000000ce5d0160> (a java.util.LinkedList)
      	at org.apache.zookeeper.ClientCnxn$SendThread.cleanAndNotifyState(ClientCnxn.java:1456)
      	at org.apache.zookeeper.ClientCnxn$SendThread.access$2800(ClientCnxn.java:868)
      	at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1641)
      	- locked <0x00000000ce5c66c0> (a org.apache.zookeeper.ClientCnxn$Packet)
      	at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1622)
      	at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:2261)
      	at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:2291)
      	at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$8.run(ZKRMStateStore.java:1053)
      	at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$8.run(ZKRMStateStore.java:1050)
      	at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithCheck(ZKRMStateStore.java:1145)
      	- locked <0x00000000c8470590> (a org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore)
      	at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore$ZKAction.runWithRetries(ZKRMStateStore.java:1178)
      	at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.getChildrenWithRetries(ZKRMStateStore.java:1050)
      	at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.loadApplicationAttemptState(ZKRMStateStore.java:606)
      	at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.loadRMAppState(ZKRMStateStore.java:595)
      	- locked <0x00000000c8470590> (a org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore)
      	at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.loadState(ZKRMStateStore.java:464)
      	- locked <0x00000000c8470590> (a org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore)
      	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:625)
      	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
      	- locked <0x00000000c824a9f0> (a java.lang.Object)
      	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1033)
      	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1074)
      	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1070)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:422)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1675)
      	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1070)
      	- locked <0x00000000c804f7c0> (a org.apache.hadoop.yarn.server.resourcemanager.ResourceManager)
      	at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:314)
      	- locked <0x00000000c8465e20> (a org.apache.hadoop.yarn.server.resourcemanager.AdminService)
      	at org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.becomeActive(EmbeddedElectorService.java:126)
      	at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:832)
      	at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:422)
      	- locked <0x00000000c82e1808> (a org.apache.hadoop.ha.ActiveStandbyElector)
      	at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:694)
      	at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:566)
      "main-SendThread(160-149-0-9:24002)":
      	at org.apache.zookeeper.ClientCnxn.finishPacket(ClientCnxn.java:775)
      	- waiting to lock <0x00000000ce5c66c0> (a org.apache.zookeeper.ClientCnxn$Packet)
      	at org.apache.zookeeper.ClientCnxn.conLossPacket(ClientCnxn.java:815)
      	at org.apache.zookeeper.ClientCnxn.access$2600(ClientCnxn.java:99)
      	at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1469)
      	- locked <0x00000000ce5d0160> (a java.util.LinkedList)
      	at org.apache.zookeeper.ClientCnxn$SendThread.cleanAndNotifyState(ClientCnxn.java:1456)
      	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1385)
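
Reading the "locked" and "waiting to lock" lines of the trace above, the wait-for relationships appear to be: VerifyActiveStatusThread waits for the ZKRMStateStore monitor held by main-EventThread; main-EventThread holds that monitor and a ClientCnxn$Packet while waiting for a LinkedList; main-SendThread holds the LinkedList while waiting for the same Packet. The following is a minimal sketch that encodes those relationships and walks them to find the cycle. `WaitForGraph` and its methods are hypothetical helpers written for illustration only; they are not part of Hadoop, ZooKeeper, or the JDK, and the thread names are shortened from the ones in the trace.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not a Hadoop or ZooKeeper class): wait-for graph of a thread dump. */
final class WaitForGraph {
    // lock address -> name of the thread that currently holds it
    private final Map<String, String> owner = new LinkedHashMap<>();
    // thread name -> address of the lock it is blocked on
    private final Map<String, String> waiting = new LinkedHashMap<>();

    public void holds(String thread, String lock) { owner.put(lock, thread); }

    public void waitsFor(String thread, String lock) { waiting.put(thread, lock); }

    /**
     * Follows thread -> awaited lock -> owning thread, starting at 'start'.
     * Returns the threads that form the cycle reached, or an empty list if
     * the chain ends at a thread that is not blocked.
     */
    public List<String> cycleFrom(String start) {
        List<String> path = new ArrayList<>();
        String thread = start;
        while (thread != null && !path.contains(thread)) {
            path.add(thread);
            String lock = waiting.get(thread);
            thread = (lock == null) ? null : owner.get(lock);
        }
        if (thread == null) {
            return new ArrayList<>();
        }
        // Drop the threads that only lead into the cycle.
        return new ArrayList<>(path.subList(path.indexOf(thread), path.size()));
    }

    /** Encodes the lock lines of the trace in this issue (thread names shortened). */
    public static WaitForGraph fromTrace() {
        WaitForGraph g = new WaitForGraph();
        g.waitsFor("VerifyActiveStatusThread", "0x00000000c8470590");
        g.holds("main-EventThread", "0x00000000c8470590");    // ZKRMStateStore
        g.holds("main-EventThread", "0x00000000ce5c66c0");    // ClientCnxn$Packet
        g.waitsFor("main-EventThread", "0x00000000ce5d0160"); // java.util.LinkedList
        g.holds("main-SendThread", "0x00000000ce5d0160");     // java.util.LinkedList
        g.waitsFor("main-SendThread", "0x00000000ce5c66c0");  // ClientCnxn$Packet
        return g;
    }

    public static void main(String[] args) {
        System.out.println(fromTrace().cycleFrom("VerifyActiveStatusThread"));
    }
}
```

Running `main` prints `[main-EventThread, main-SendThread]`: under this reading, the cycle is the Packet/LinkedList lock-order inversion between the two ZooKeeper client threads, and VerifyActiveStatusThread is blocked behind it rather than being part of it.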
      

            People

            • Assignee: Unassigned
            • Reporter: Jobo Bob.zhao
            • Votes: 0
            • Watchers: 5

              Dates

              • Created:
                Updated:
                Resolved: