ActiveMQ Artemis / ARTEMIS-3264

Core to AMQP conversion error causes client disconnect

Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.17.0
    • Fix Version/s: 2.27.0
    • Component/s: AMQP, Broker
    • Labels: None
    • Environment: Embedded Apache Artemis 2.17.0
      Windows Server 2016 Standard (10.0.14393)
      Java(TM) SE Runtime Environment (build 1.8.0_152-b16)
      Java HotSpot(TM) 64-Bit Server VM (build 25.152-b16, mixed mode)

    Description

      We are deploying a mesh of embedded brokers and by default use core bridges to replicate data between different broker instances / topics.

      The clients that actually consume messages are connected using AMQP (QPID, AMQP .Net Lite).

      Recently we encountered a situation where the broker could not deliver a message to a (Java QPID) client because the internal conversion from Core to AMQP failed (see attached log file).

      As a result, the client was disconnected and stopped receiving messages altogether. It remained stuck in a JMS receive call and apparently was never informed of the disconnect (not sure if this is a QPID/Proton issue). Even after a restart, the client could not connect to the server anymore; we had to restart the server before it could connect again.

      We are currently working around this issue by using AMQP (i.e. JMS) as the only client-side protocol, so that no Core-to-AMQP conversion happens in the first place.

      However, I question whether the way the broker deals with such errors is a good idea: it disconnects the client and keeps the message in the queue, so even after a reconnect the delivery fails again with the same exception.

      Judging by the call stack (ending in QueueImpl:3800), this kind of error is handled in a very generic way: the handler method does not distinguish between different types of exceptions and knows nothing about why delivery failed, yet it still defaults to disconnecting the corresponding client.

      In the situation described above, I think the broker should forward the offending message to a DLQ instead and continue with the next message. Currently the message clogs the queue and has to be deleted or moved manually before processing can continue.
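
      To illustrate the proposed behaviour, here is a minimal, self-contained sketch in plain Java. It does not use any Artemis classes and does not reflect the actual code in QueueImpl; the names PoisonMessageSketch, ConversionException and deliverAll are hypothetical and exist only for this example. The idea is that a failure which will repeat on every redelivery is caught separately, the message is set aside on a dead-letter list, and delivery carries on with the next message.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.function.Function;

public class PoisonMessageSketch {

    /** Hypothetical exception type marking a failure that would repeat on every redelivery. */
    static class ConversionException extends RuntimeException {
        ConversionException(String message) {
            super(message);
        }
    }

    /**
     * Drains the queue in order. A message whose conversion fails is moved to the
     * dead-letter list, and delivery continues with the next message. Any other
     * exception type is deliberately not caught here and propagates to the caller.
     */
    static <M, W> List<W> deliverAll(Deque<M> queue, Function<M, W> converter, List<M> deadLetters) {
        List<W> delivered = new ArrayList<>();
        while (!queue.isEmpty()) {
            M message = queue.poll();
            try {
                delivered.add(converter.apply(message));
            } catch (ConversionException e) {
                // Redelivering this message would fail the same way, so set it aside.
                deadLetters.add(message);
            }
        }
        return delivered;
    }

    public static void main(String[] args) {
        Deque<String> queue = new ArrayDeque<>(Arrays.asList("a", "bad", "c"));
        List<String> deadLetters = new ArrayList<>();

        // Stand-in converter: upper-cases the payload and rejects the message "bad".
        List<String> delivered = deliverAll(queue, m -> {
            if (m.equals("bad")) {
                throw new ConversionException("cannot convert " + m);
            }
            return m.toUpperCase();
        }, deadLetters);

        System.out.println(delivered + " " + deadLetters); // prints "[A, C] [bad]"
    }
}
```

      In this sketch the consumer still receives the messages queued behind the failing one, and the failing message remains available for inspection instead of blocking the queue.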

      Attachments

        1. activemq_artemis.log
          31 kB
          Christian Danner

          People

            Assignee: Justin Bertram (jbertram)
            Reporter: Christian Danner (perdurabo)
            Votes: 1
            Watchers: 3

              Time Tracking

                Original Estimate: Not Specified
                Remaining Estimate: 0h
                Time Spent: 50m