Uploaded image for project: 'ActiveMQ Artemis'
  1. ActiveMQ Artemis
  2. ARTEMIS-480

[Artemis Testsuite] BridgeReconnectTest.testDeliveringCountOnBridgeConnectionFailure fails due to racing condition

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 1.1.0, 1.2.0, 1.3.0
    • 1.3.0
    • Broker
    • None

    Description

      java.lang.AssertionError: Delivering count of a source queue should be zero on connection failure expected:<0> but was:<1>
      	at org.junit.Assert.fail(Assert.java:88)
      	at org.junit.Assert.failNotEquals(Assert.java:743)
      	at org.junit.Assert.assertEquals(Assert.java:118)
      	at org.junit.Assert.assertEquals(Assert.java:555)
      	at org.apache.activemq.artemis.tests.integration.cluster.bridge.BridgeReconnectTest.testDeliveringCountOnBridgeConnectionFailure(BridgeReconnectTest.java:688)
      
      18:25:43,722 WARN  [org.apache.activemq.artemis.core.server] AMQ222094: Bridge unable to send message Reference[22]:NON-RELIABLE:ServerMessage[messageID=22,durable=false,userID=null,priority=4, bodySize=79, timestamp=Tue Feb 02 18:25:43 EST 2016,expiration=0, durable=false, address=testAddress,properties=TypedProperties[propkey=18,_AMQ_BRIDGE_DUP=[47A6 779A CA04 11E5 9C91 A169 FCCB 5522 0000 0000 0000 0016)]]@1861263739, will try again once bridge reconnects: ActiveMQObjectClosedException[errorType=OBJECT_CLOSED message=AMQ119018: Producer is closed]
      	at org.apache.activemq.artemis.core.client.impl.ClientProducerImpl.checkClosed(ClientProducerImpl.java:298) [artemis-core-client-1.1.0.wildfly-012.jar:]
      	at org.apache.activemq.artemis.core.client.impl.ClientProducerImpl.send(ClientProducerImpl.java:122) [artemis-core-client-1.1.0.wildfly-012.jar:]
      	at org.apache.activemq.artemis.core.server.cluster.impl.BridgeImpl.deliverStandardMessage(BridgeImpl.java:698) [artemis-server-1.1.0.wildfly-012.jar:]
      	at org.apache.activemq.artemis.core.server.cluster.impl.BridgeImpl.handle(BridgeImpl.java:574) [artemis-server-1.1.0.wildfly-012.jar:]
      	at org.apache.activemq.artemis.core.server.impl.QueueImpl.handle(QueueImpl.java:2410) [artemis-server-1.1.0.wildfly-012.jar:]
      	at org.apache.activemq.artemis.core.server.impl.QueueImpl.deliver(QueueImpl.java:1813) [artemis-server-1.1.0.wildfly-012.jar:]
      	at org.apache.activemq.artemis.core.server.impl.QueueImpl.access$1400(QueueImpl.java:97) [artemis-server-1.1.0.wildfly-012.jar:]
      	at org.apache.activemq.artemis.core.server.impl.QueueImpl$DeliverRunner.run(QueueImpl.java:2581) [artemis-server-1.1.0.wildfly-012.jar:]
      	at org.apache.activemq.artemis.utils.OrderedExecutorFactory$OrderedExecutor$ExecutorTask.run(OrderedExecutorFactory.java:100) [artemis-core-client-1.1.0.wildfly-012.jar:]
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1153) [rt.jar:1.8.0-internal]
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [rt.jar:1.8.0-internal]
      	at java.lang.Thread.run(Thread.java:785) [vm.jar:1.8.0-internal]
      

      I've investigated this issue and I found the race condition which causes mentioned fail. Problem lies in [1]. When bridge detects some problem, it calls connectionFailed method which call for every message in refs the Queue.cancel(ref, timeBase). Queue.cancel decreases deliveryCount for canceled message. However before this step, we remove reference on actual message from refs on line 20, so for this message the deliveryCount is not decreased. This is correct behavior, because for this message we return HandleStatus.BUSY. I think that problem is in QueueImpl#deliver method. If bridge returns HandleStatus.BUSY we should decrease deliveryCount. So I think that instead of [2], there should be [3].

      [1]

      private BridgeImpl#HandleStatus deliverStandardMessage(SimpleString dest, final MessageReference ref, ServerMessage message) {
            // if we failover during send then there is a chance that the
            // that this will throw a disconnect, we need to remove the message
            // from the acks so it will get resent, duplicate detection will cope
            // with any messages resent
      
            if (ActiveMQServerLogger.LOGGER.isTraceEnabled()) {
               ActiveMQServerLogger.LOGGER.trace("going to send message: " + message + " from " + this.getQueue());
            }
      
            try {
               producer.send(dest, message);
            }
            catch (final ActiveMQException e) {
               ActiveMQServerLogger.LOGGER.bridgeUnableToSendMessage(e, ref);
      
               synchronized (refs) {
                  // We remove this reference as we are returning busy which means the reference will never leave the Queue.
                  // because of this we have to remove the reference here
                  refs.remove(message.getMessageID());
               }
      
               connectionFailed(e, false);
      
               return HandleStatus.BUSY;
            }
      
            return HandleStatus.HANDLED;
         }
      

      [2]

      else if (status == HandleStatus.BUSY) {
          holder.iter.repeat();
      
           noDelivery++;
      }
      

      [3]

      else if (status == HandleStatus.BUSY) {
          decDelivering();
          holder.iter.repeat();
      
           noDelivery++;
      }
      

      Steps to reproduce:

      1. cd tests
      2. while true; do mvn -Dtest=BridgeReconnectTest#testDeliveringCountOnBridgeConnectionFailure -Ptests -DfailIfNoTests=false test; done

      Attachments

        Activity

          People

            Unassigned Unassigned
            iweiss Ingo Weiss
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: