Qpid / QPID-7695

[Java Broker, BDB HA] Indefinite hang when new node joins existing group but existing node is unresponsive

    Details

      Description

      When adding a new node to an existing group, Qpid internally uses com.sleepycat.je.rep.util.DbPing to establish initial contact with an existing node and perform some preliminary checks. If that existing node is unresponsive, the Broker's Config thread blocks indefinitely in a socket read during the service handshake (DbPing.getNodeState in the stack trace below), and the Broker cannot recover. BDB JE 5.0.104 is in use.

      The Broker Config thread stack trace looks like this:

       java.lang.Thread.State: RUNNABLE
      	  at sun.nio.ch.FileDispatcherImpl.read0(FileDispatcherImpl.java:-1)
      	  at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
      	  at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
      	  at sun.nio.ch.IOUtil.read(IOUtil.java:197)
      	  at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
      	  - locked <0x168d> (a java.lang.Object)
      	  at com.sleepycat.je.rep.utilint.ServiceDispatcher.doServiceHandshake(ServiceDispatcher.java:325)
      	  at com.sleepycat.je.rep.util.DbPing.getNodeState(DbPing.java:194)
      	  at org.apache.qpid.server.store.berkeleydb.replication.ReplicatedEnvironmentFacade.getRemoteNodeState(ReplicatedEnvironmentFacade.java:1807)
      	  at org.apache.qpid.server.store.berkeleydb.replication.ReplicatedEnvironmentFacade.connectToHelperNodeAndCheckPermittedHosts(ReplicatedEnvironmentFacade.java:1846)
      	  at org.apache.qpid.server.virtualhostnode.berkeleydb.BDBHAVirtualHostNodeImpl.getPermittedNodesFromHelper(BDBHAVirtualHostNodeImpl.java:566)
      	  at org.apache.qpid.server.virtualhostnode.berkeleydb.BDBHAVirtualHostNodeImpl.validateOnCreate(BDBHAVirtualHostNodeImpl.java:546)
      	  at org.apache.qpid.server.model.AbstractConfiguredObject$6.execute(AbstractConfiguredObject.java:878)
      	  at org.apache.qpid.server.model.AbstractConfiguredObject$6.execute(AbstractConfiguredObject.java:865)
      	  at org.apache.qpid.server.model.AbstractConfiguredObject$2.execute(AbstractConfiguredObject.java:636)
      	  at org.apache.qpid.server.model.AbstractConfiguredObject$2.execute(AbstractConfiguredObject.java:629)
      	  at org.apache.qpid.server.configuration.updater.TaskExecutorImpl$TaskLoggingWrapper.execute(TaskExecutorImpl.java:240)
      	  at org.apache.qpid.server.configuration.updater.TaskExecutorImpl.submitWrappedTask(TaskExecutorImpl.java:157)
      	  at org.apache.qpid.server.configuration.updater.TaskExecutorImpl.submit(TaskExecutorImpl.java:145)
      	  at org.apache.qpid.server.model.AbstractConfiguredObject.doOnConfigThread(AbstractConfiguredObject.java:628)
      	  at org.apache.qpid.server.model.AbstractConfiguredObject.createAsync(AbstractConfiguredObject.java:864)
      	  at org.apache.qpid.server.model.AbstractConfiguredObjectTypeFactory.createAsync(AbstractConfiguredObjectTypeFactory.java:75)
      	  at org.apache.qpid.server.model.ConfiguredObjectFactoryImpl.createAsync(ConfiguredObjectFactoryImpl.java:145)
      	  at org.apache.qpid.server.model.BrokerImpl.createVirtualHostNodeAsync(BrokerImpl.java:605)
      	  at org.apache.qpid.server.model.BrokerImpl.addChildAsync(BrokerImpl.java:660)
      	  at org.apache.qpid.server.model.AbstractConfiguredObject$17.execute(AbstractConfiguredObject.java:2094)
      	  at org.apache.qpid.server.model.AbstractConfiguredObject$17.execute(AbstractConfiguredObject.java:2089)
      	  at org.apache.qpid.server.model.AbstractConfiguredObject$2.execute(AbstractConfiguredObject.java:636)
      	  at org.apache.qpid.server.model.AbstractConfiguredObject$2.execute(AbstractConfiguredObject.java:629)
      	  at org.apache.qpid.server.configuration.updater.TaskExecutorImpl$TaskLoggingWrapper.execute(TaskExecutorImpl.java:240)
      	  at org.apache.qpid.server.configuration.updater.TaskExecutorImpl$CallableWrapper$1.run(TaskExecutorImpl.java:312)
      	  at java.security.AccessController.doPrivileged(AccessController.java:-1)
      	  at javax.security.auth.Subject.doAs(Subject.java:360)
      	  at org.apache.qpid.server.configuration.updater.TaskExecutorImpl$CallableWrapper.call(TaskExecutorImpl.java:305)
      	  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      	  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      	  at java.lang.Thread.run(Thread.java:745)
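
      The trace shows the Config thread itself performing the blocking read. One possible way to bound such a call, shown here only as a minimal sketch and not as the fix adopted by Qpid, is to run the blocking step on a helper thread and wait for it with a timeout. BoundedCall and callWithTimeout are hypothetical names, and the lambdas in main stand in for a call such as DbPing.getNodeState; only JDK classes are used. Cancelling the future interrupts the helper thread, which should also unblock a read on an NIO SocketChannel, because such channels are closed when the blocked thread is interrupted.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of Qpid or BDB JE.
final class BoundedCall {

    /**
     * Runs a potentially blocking task on a helper thread and waits at most
     * timeoutMillis for its result. Returns Optional.empty() on timeout.
     */
    static <T> Optional<T> callWithTimeout(Callable<T> task, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-call");
            t.setDaemon(true); // a stuck helper thread must not keep the JVM alive
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            return Optional.ofNullable(future.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the helper thread
            return Optional.empty();
        } catch (ExecutionException e) {
            throw new IllegalStateException("Task failed", e.getCause());
        } catch (InterruptedException e) {
            future.cancel(true);
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            throw new IllegalStateException("Interrupted while waiting", e);
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Stand-in for an unresponsive node: blocks far longer than the timeout.
        Optional<String> hung = callWithTimeout(() -> {
            Thread.sleep(60_000);
            return "MASTER";
        }, 200);
        System.out.println("unresponsive node -> " + hung);

        // Stand-in for a responsive node: returns immediately.
        Optional<String> ok = callWithTimeout(() -> "REPLICA", 200);
        System.out.println("responsive node -> " + ok);
    }
}
```

      With this shape, the calling thread regains control after the timeout and can report a validation failure instead of hanging, even if the helper thread takes longer to unwind.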
      
      

            People

            • Assignee: Unassigned
            • Reporter: Keith Wall (k-wall)