ActiveMQ Artemis / ARTEMIS-3809

LargeMessageControllerImpl hangs the message consumer


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.21.0
    • Fix Version/s: 2.29.0
    • Component/s: Broker
    • Environment:
      OS: Windows Server 2019
      JVM: OpenJDK 64-Bit Server VM Temurin-17.0.1+12
      Max Memory (-Xmx): 6 GB
      Allocated to JVM: 4.168 GB
      Currently in use: 3.398 GB (heap 3.391 GB, non-heap 0.123 GB)

    Description

      I wondered whether this might be a recurrence of ARTEMIS-2293, but this happens on 2.21.0 and I can see that the code change from that issue is present in LargeMessageControllerImpl.

      We are using the default min-large-message-size of 100K.
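
      For reference, here is a minimal sketch (core client API; the vm://0 URL and variable name are illustrative, not our exact code) of where that threshold lives on the client side:

          import org.apache.activemq.artemis.api.core.client.ActiveMQClient;
          import org.apache.activemq.artemis.api.core.client.ServerLocator;

          ServerLocator locator = ActiveMQClient.createServerLocator("vm://0");
          // Default threshold is 100 KiB (ActiveMQClient.DEFAULT_MIN_LARGE_MESSAGE_SIZE = 100 * 1024).
          // Messages whose encoded size exceeds it are streamed as "large" messages.
          locator.setMinLargeMessageSize(ActiveMQClient.DEFAULT_MIN_LARGE_MESSAGE_SIZE);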

      Many messages are passing through the broker when this happens. I would anticipate that most of the messages are smaller than 100K, but clearly some of them must exceed it. After some number of messages, a particular consumer ceases to consume messages.

      After the system became "hung", I captured a stack trace and identified that the consumer thread is stuck in an Object.wait() for a notify that appears to never come.

      Here is the trace I was able to capture:

      Thread-2 (ActiveMQ-client-global-threads) id=78 state=TIMED_WAITING
          - waiting on <0x43523a75> (a org.apache.activemq.artemis.core.client.impl.LargeMessageControllerImpl)
          - locked <0x43523a75> (a org.apache.activemq.artemis.core.client.impl.LargeMessageControllerImpl)
          at  java.base@17.0.1/java.lang.Object.wait(Native Method)
          at org.apache.activemq.artemis.core.client.impl.LargeMessageControllerImpl.waitCompletion(LargeMessageControllerImpl.java:294)
          at org.apache.activemq.artemis.core.client.impl.LargeMessageControllerImpl.saveBuffer(LargeMessageControllerImpl.java:268)
          at org.apache.activemq.artemis.core.client.impl.ClientLargeMessageImpl.checkBuffer(ClientLargeMessageImpl.java:157)
          at org.apache.activemq.artemis.core.client.impl.ClientLargeMessageImpl.getBodyBuffer(ClientLargeMessageImpl.java:89)
          at mypackage.MessageListener.handleMessage(MessageListener.java:46)
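
      For context, the listener at the bottom of that stack is shaped roughly like this (reconstructed here as a sketch of my own application code; the class and method names match the frame above, the body is simplified):

          import org.apache.activemq.artemis.api.core.ActiveMQBuffer;
          import org.apache.activemq.artemis.api.core.client.ClientMessage;
          import org.apache.activemq.artemis.api.core.client.MessageHandler;

          public class MessageListener implements MessageHandler {
             @Override
             public void onMessage(ClientMessage message) {
                handleMessage(message);
             }

             private void handleMessage(ClientMessage message) {
                // For a large message this call goes through ClientLargeMessageImpl.checkBuffer()
                // -> LargeMessageControllerImpl.saveBuffer() -> waitCompletion(), which is where
                // the thread above is parked.
                ActiveMQBuffer body = message.getBodyBuffer();
                byte[] payload = new byte[body.readableBytes()];
                body.readBytes(payload);
                // ... application-specific processing of payload ...
             }
          }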
      

      The app can run either as a single node using the InVM transport or as a cluster using TCP. To my knowledge, I have only seen this issue occur with the InVM transport.
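
      For clarity, the two connection modes look roughly like this (URLs are examples, not our exact configuration):

          import org.apache.activemq.artemis.api.core.client.ActiveMQClient;
          import org.apache.activemq.artemis.api.core.client.ServerLocator;

          // Single-node mode: client and broker share a JVM, InVM transport (where the hang occurs).
          ServerLocator inVm = ActiveMQClient.createServerLocator("vm://0");

          // Cluster mode: remote brokers over TCP (Netty transport); not observed to hang so far.
          ServerLocator tcp = ActiveMQClient.createServerLocator("tcp://broker-host:61616");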

      I am not an expert in this code, but I can tell from the call stack that 0 must be the value of timeWait passed into waitCompletion(). From what I can discern of the code changes in 2.21.0, it should be adjusting the wait to the configured readTimeout (I think?) so that the read eventually gives up rather than remaining blocked forever.
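
      To illustrate what I mean (this is my own sketch of the expected behaviour, not the actual Artemis source; readTimeoutMillis and allPacketsReceived() are placeholders):

          // A bounded wait: when timeWait is 0, fall back to the configured read timeout
          // instead of waiting indefinitely for a notify that may never come.
          private synchronized boolean waitCompletion(long timeWait) throws InterruptedException {
             long effectiveWait = (timeWait == 0) ? readTimeoutMillis : timeWait;
             long deadline = System.currentTimeMillis() + effectiveWait;
             while (!allPacketsReceived()) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                   return false;        // give up instead of blocking the consumer forever
                }
                wait(remaining);        // woken by notifyAll() when more of the body arrives
             }
             return true;
          }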

      We have persistenceEnabled = false, which leads me to believe that the only disk activity for messages should be related to large messages.
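
      For reference, the relevant setting as we apply it (shown here via the programmatic/embedded API as a sketch; our actual configuration is equivalent):

          import org.apache.activemq.artemis.core.config.Configuration;
          import org.apache.activemq.artemis.core.config.impl.ConfigurationImpl;

          // With the journal disabled, large-message files should be the only per-message disk I/O.
          Configuration config = new ConfigurationImpl()
             .setPersistenceEnabled(false);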

      On a machine and in a context where this was happening consistently, I adjusted min-large-message-size upwards and the problem went away. That is workable for my application, but ultimately any message that crosses the threshold and becomes a large message appears to hang its consumer indefinitely.
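
      The workaround, expressed against the core client API (the 1 MB value is illustrative, not the exact number we used):

          import org.apache.activemq.artemis.api.core.client.ActiveMQClient;
          import org.apache.activemq.artemis.api.core.client.ServerLocator;

          ServerLocator locator = ActiveMQClient.createServerLocator("vm://0");
          // Raising the threshold so our typical payloads never become "large" messages
          // avoids the hang, but it only masks the underlying problem.
          locator.setMinLargeMessageSize(1024 * 1024);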

      Attachments

        1. image-2022-05-03-10-51-46-872.png (151 kB, uploaded by David Bennion)


          People

            Assignee: Clebert Suconic (clebertsuconic)
            Reporter: David Bennion (funkyjive)
