QPID-7695

[Java Broker, BDB HA] Indefinite hang when new node joins existing group but existing node is unresponsive

Details

    Description

      When a new node is added to an existing group, Qpid internally uses com.sleepycat.je.rep.util.DbPing#DbPing() to establish initial contact with an existing (helper) node and to perform some preliminary checks. If that existing node is unresponsive, the Broker's Config Thread hangs indefinitely and cannot be recovered. BDB JE 5.0.104 is in use.

      The Broker Config thread stack trace looks like this:

       java.lang.Thread.State: RUNNABLE
      	  at sun.nio.ch.FileDispatcherImpl.read0(FileDispatcherImpl.java:-1)
      	  at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
      	  at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
      	  at sun.nio.ch.IOUtil.read(IOUtil.java:197)
      	  at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
      	  - locked <0x168d> (a java.lang.Object)
      	  at com.sleepycat.je.rep.utilint.ServiceDispatcher.doServiceHandshake(ServiceDispatcher.java:325)
      	  at com.sleepycat.je.rep.util.DbPing.getNodeState(DbPing.java:194)
      	  at org.apache.qpid.server.store.berkeleydb.replication.ReplicatedEnvironmentFacade.getRemoteNodeState(ReplicatedEnvironmentFacade.java:1807)
      	  at org.apache.qpid.server.store.berkeleydb.replication.ReplicatedEnvironmentFacade.connectToHelperNodeAndCheckPermittedHosts(ReplicatedEnvironmentFacade.java:1846)
      	  at org.apache.qpid.server.virtualhostnode.berkeleydb.BDBHAVirtualHostNodeImpl.getPermittedNodesFromHelper(BDBHAVirtualHostNodeImpl.java:566)
      	  at org.apache.qpid.server.virtualhostnode.berkeleydb.BDBHAVirtualHostNodeImpl.validateOnCreate(BDBHAVirtualHostNodeImpl.java:546)
      	  at org.apache.qpid.server.model.AbstractConfiguredObject$6.execute(AbstractConfiguredObject.java:878)
      	  at org.apache.qpid.server.model.AbstractConfiguredObject$6.execute(AbstractConfiguredObject.java:865)
      	  at org.apache.qpid.server.model.AbstractConfiguredObject$2.execute(AbstractConfiguredObject.java:636)
      	  at org.apache.qpid.server.model.AbstractConfiguredObject$2.execute(AbstractConfiguredObject.java:629)
      	  at org.apache.qpid.server.configuration.updater.TaskExecutorImpl$TaskLoggingWrapper.execute(TaskExecutorImpl.java:240)
      	  at org.apache.qpid.server.configuration.updater.TaskExecutorImpl.submitWrappedTask(TaskExecutorImpl.java:157)
      	  at org.apache.qpid.server.configuration.updater.TaskExecutorImpl.submit(TaskExecutorImpl.java:145)
      	  at org.apache.qpid.server.model.AbstractConfiguredObject.doOnConfigThread(AbstractConfiguredObject.java:628)
      	  at org.apache.qpid.server.model.AbstractConfiguredObject.createAsync(AbstractConfiguredObject.java:864)
      	  at org.apache.qpid.server.model.AbstractConfiguredObjectTypeFactory.createAsync(AbstractConfiguredObjectTypeFactory.java:75)
      	  at org.apache.qpid.server.model.ConfiguredObjectFactoryImpl.createAsync(ConfiguredObjectFactoryImpl.java:145)
      	  at org.apache.qpid.server.model.BrokerImpl.createVirtualHostNodeAsync(BrokerImpl.java:605)
      	  at org.apache.qpid.server.model.BrokerImpl.addChildAsync(BrokerImpl.java:660)
      	  at org.apache.qpid.server.model.AbstractConfiguredObject$17.execute(AbstractConfiguredObject.java:2094)
      	  at org.apache.qpid.server.model.AbstractConfiguredObject$17.execute(AbstractConfiguredObject.java:2089)
      	  at org.apache.qpid.server.model.AbstractConfiguredObject$2.execute(AbstractConfiguredObject.java:636)
      	  at org.apache.qpid.server.model.AbstractConfiguredObject$2.execute(AbstractConfiguredObject.java:629)
      	  at org.apache.qpid.server.configuration.updater.TaskExecutorImpl$TaskLoggingWrapper.execute(TaskExecutorImpl.java:240)
      	  at org.apache.qpid.server.configuration.updater.TaskExecutorImpl$CallableWrapper$1.run(TaskExecutorImpl.java:312)
      	  at java.security.AccessController.doPrivileged(AccessController.java:-1)
      	  at javax.security.auth.Subject.doAs(Subject.java:360)
      	  at org.apache.qpid.server.configuration.updater.TaskExecutorImpl$CallableWrapper.call(TaskExecutorImpl.java:305)
      	  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      	  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      	  at java.lang.Thread.run(Thread.java:745)
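
      One possible mitigation, sketched here only as an illustration and not as the fix adopted for this issue, is to run the blocking check on a helper thread and bound the wait with Future.get(timeout). The class BoundedPing and the method callWithTimeout are hypothetical names, and the Callable stands in for the real DbPing call. The trace shows the thread blocked in SocketChannelImpl.read; SocketChannel is an InterruptibleChannel, so cancelling the task with interruption should close the channel and unblock the read, although how DbPing reacts to that is an assumption.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: bounds a blocking call so the caller cannot hang forever.
public class BoundedPing {

    /**
     * Runs the given check on a separate thread and waits at most timeoutMillis.
     * On timeout the task is cancelled with interruption and TimeoutException
     * is rethrown to the caller.
     */
    public static <T> T callWithTimeout(Callable<T> check, long timeoutMillis) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<T> future = executor.submit(check);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Interrupt the worker; a thread blocked in an interruptible
            // channel read is released when its channel is closed.
            future.cancel(true);
            throw e;
        } finally {
            // Do not leave the helper thread behind.
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for an unresponsive helper node: the call never returns in time.
        Callable<String> unresponsiveNode = () -> {
            Thread.sleep(60_000);
            return "never reached";
        };
        try {
            callWithTimeout(unresponsiveNode, 500);
        } catch (TimeoutException e) {
            System.out.println("Helper node did not respond in time");
        }
    }
}
```

      With this shape the calling thread gets a TimeoutException it can report back to the operator instead of blocking without limit.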
      
      

          People

            Assignee: Unassigned
            Reporter: Keith Wall (kwall)