Uploaded image for project: 'Qpid'
  1. Qpid
  2. QPID-6999

[Java Broker, BDBStore, HA] In replicated environment JE transactions aborted on committing can cause Broker shutdown

    XMLWordPrintableJSON

Details

    Description

      JE Transaction aborted in the middle of commit in Replicated Environment can cause abrupt Broker shutdown

      ########################################################################
      #
      # Unhandled Exception java.lang.IllegalStateException: Transaction Id -42 has been closed. in Thread IO-/127.0.0.1:59532
      #
      # Exiting
      #
      ########################################################################
      java.lang.IllegalStateException: Transaction Id -42 has been closed.
       at com.sleepycat.je.Transaction.checkOpen(Transaction.java:894)
       at com.sleepycat.je.Transaction.doCommit(Transaction.java:592)
       at com.sleepycat.je.Transaction.commit(Transaction.java:410)
       at org.apache.qpid.server.store.berkeleydb.replication.ReplicatedEnvironmentFacade.commitAsync(ReplicatedEnvironmentFacade.java:302)
       at org.apache.qpid.server.store.berkeleydb.AbstractBDBMessageStore.commitTranAsyncImpl(AbstractBDBMessageStore.java:794)
       at org.apache.qpid.server.store.berkeleydb.AbstractBDBMessageStore.access$1200(AbstractBDBMessageStore.java:74)
       at org.apache.qpid.server.store.berkeleydb.AbstractBDBMessageStore$BDBTransaction.commitTranAsync(AbstractBDBMessageStore.java:1364)
       at org.apache.qpid.server.txn.LocalTransaction.commitAsync(LocalTransaction.java:399)
       at org.apache.qpid.server.protocol.v0_8.AMQChannel.commit(AMQChannel.java:1220)
       at org.apache.qpid.server.protocol.v0_8.AMQChannel.receiveTxCommit(AMQChannel.java:3622)
       at org.apache.qpid.codec.ServerDecoder.processMethod(ServerDecoder.java:228)
       at org.apache.qpid.codec.AMQDecoder.processFrame(AMQDecoder.java:191)
       at org.apache.qpid.server.protocol.v0_8.BrokerDecoder.doProcessFrame(BrokerDecoder.java:114)
       at org.apache.qpid.server.protocol.v0_8.BrokerDecoder.access$000(BrokerDecoder.java:36)
       at org.apache.qpid.server.protocol.v0_8.BrokerDecoder$1.run(BrokerDecoder.java:78)
       at org.apache.qpid.server.protocol.v0_8.BrokerDecoder$1.run(BrokerDecoder.java:74)
       at java.security.AccessController.doPrivileged(Native Method)
       at org.apache.qpid.server.protocol.v0_8.BrokerDecoder.processFrame(BrokerDecoder.java:73)
       at org.apache.qpid.codec.AMQDecoder.processInput(AMQDecoder.java:173)
       at org.apache.qpid.codec.AMQDecoder.decode(AMQDecoder.java:114)
       at org.apache.qpid.codec.ServerDecoder.decodeBuffer(ServerDecoder.java:43)
       at org.apache.qpid.server.protocol.v0_8.AMQPConnection_0_8$1.run(AMQPConnection_0_8.java:266)
       at org.apache.qpid.server.protocol.v0_8.AMQPConnection_0_8$1.run(AMQPConnection_0_8.java:258)
       at java.security.AccessController.doPrivileged(Native Method)
       at org.apache.qpid.server.protocol.v0_8.AMQPConnection_0_8.received(AMQPConnection_0_8.java:257)
       at org.apache.qpid.server.transport.MultiVersionProtocolEngine.received(MultiVersionProtocolEngine.java:142)
       at org.apache.qpid.server.transport.NonBlockingConnection.processAmqpData(NonBlockingConnection.java:547)
       at org.apache.qpid.server.transport.NonBlockingConnectionPlainDelegate.processData(NonBlockingConnectionPlainDelegate.java:58)
       at org.apache.qpid.server.transport.NonBlockingConnection.doRead(NonBlockingConnection.java:446)
       at org.apache.qpid.server.transport.NonBlockingConnection.doWork(NonBlockingConnection.java:253)
       at org.apache.qpid.server.transport.NetworkConnectionScheduler.processConnection(NetworkConnectionScheduler.java:108)
       at org.apache.qpid.server.transport.SelectorThread$ConnectionProcessor.processConnection(SelectorThread.java:499)
       at org.apache.qpid.server.transport.SelectorThread$SelectionTask.performSelect(SelectorThread.java:337)
       at org.apache.qpid.server.transport.SelectorThread$SelectionTask.run(SelectorThread.java:86)
       at org.apache.qpid.server.transport.SelectorThread.run(SelectorThread.java:457)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       at java.lang.Thread.run(Thread.java:745)
      

      In my tests syncing data to disk in commit thread failed (due to majority lost) causing abort of pending commit jobs. IllegalStateException was reported IO-Threads waiting for commit finish.

      try-catch blocks in ReplicatedEnvironmentFacade#commit and ReplicatedEnvironmentFacade#commitAsync do not handle IllegalStateException on environment restart as in the rest of the code.

      Cathcing and handling all RuntimeExceptions should fix the issue

      Attachments

        Activity

          People

            Unassigned Unassigned
            orudyy Alex Rudyy
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: