ActiveMQ
  1. ActiveMQ
  2. AMQ-2103

Memory leak when marshaling ActiveMQTextMessage to persistent store

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 5.0.0
    • Fix Version/s: 5.4.2
    • Component/s: Broker
    • Labels:
      None
    • Environment:

      ActiveMQ 5.0.0.20-fuse

      Description

      When an org.apache.activemq.command.ActiveMQTextMessage is marshaled into the persistence store some portion of the messages are stored in memory (i.e. pending cursor/consumer dispatch queue). The messages stored in memory have the potential to cause the broker to run out of memory because org.apache.activemq.command.ActiveMQTextMessage objects can store the data twice, once in the 'text' field and once in the 'content' field. Normally this isn't a problem since the 'content' field is cleared when the message is being used in a client application (i.e. by calling getText() clears content). The problem occurs when a consumer is slow and a large number of messages are sitting around on the broker in pending/dispatch memory space. The message is marshaled for the store and then persisted to disk and copied to pending memory when space is available.

      This bug affects any ActiveMQ*Message object that does not clear its temporary data (i.e. 'text' field) once it has been marshaled. When a message is marshaled we should null the derived objects memory space once the data has been written to the parent object's 'content' field.

      1. Duplicate Message Data (Internal Marshalling).png
        77 kB
        Trevor Pounds
      2. jhat_ActiveMQTextMessage_0xe837a478.htm
        4 kB
        Trevor Pounds
      3. jhat_ByteSequence_0xe837a5c0.htm
        1 kB
        Trevor Pounds
      4. jhat_ByteSequence_data_0xe837adb0.htm
        6 kB
        Trevor Pounds
      5. AMQ-2103.diff
        0.5 kB
        Trevor Pounds
      6. heap_100_1MB_messages.png
        93 kB
        Trevor Pounds

        Activity

        Hide
        Trevor Pounds added a comment -

        attaching memory usage diagram showing how the message data can be duplicated.

        Show
        Trevor Pounds added a comment - attaching memory usage diagram showing how the message data can be duplicated.
        Hide
        Trevor Pounds added a comment -

        attaching jhat heap analysis showing ActiveMQTextMessage object with duplicate data.

        Show
        Trevor Pounds added a comment - attaching jhat heap analysis showing ActiveMQTextMessage object with duplicate data.
        Hide
        Trevor Pounds added a comment -

        attaching jhat heap analysis showing ActiveMQTextMessage->ByteSequence object holding the byte array data.

        Show
        Trevor Pounds added a comment - attaching jhat heap analysis showing ActiveMQTextMessage->ByteSequence object holding the byte array data.
        Hide
        Trevor Pounds added a comment -

        attaching jhat heap analysis showing ActiveMQTextMessage->ByteSequence->byte array object holding the message data.

        Show
        Trevor Pounds added a comment - attaching jhat heap analysis showing ActiveMQTextMessage->ByteSequence->byte array object holding the message data.
        Hide
        Trevor Pounds added a comment -

        attaching patch that clears ActiveMQTextMessage 'text' field data when marshaling.

        Show
        Trevor Pounds added a comment - attaching patch that clears ActiveMQTextMessage 'text' field data when marshaling.
        Hide
        Trevor Pounds added a comment -

        I also forgot to mention that this probably only affects use cases with something like vm transport where the message stays internal to the same JVM and is not marshaled across the wire.

        Show
        Trevor Pounds added a comment - I also forgot to mention that this probably only affects use cases with something like vm transport where the message stays internal to the same JVM and is not marshaled across the wire.
        Hide
        Trevor Pounds added a comment -

        attaching heap usage graph comparing memory before and after applying patch

        Show
        Trevor Pounds added a comment - attaching heap usage graph comparing memory before and after applying patch
        Hide
        Rob Davies added a comment -

        Patch applied in SVN revision 743258

        Show
        Rob Davies added a comment - Patch applied in SVN revision 743258
        Hide
        Trevor Pounds added a comment -

        Rob did you look at all to see if this affects other message types (i.e. ActiveMQObjectMessage, ActiveMQMapMessage)? I believe some of these these suffer from the same problem.

        Show
        Trevor Pounds added a comment - Rob did you look at all to see if this affects other message types (i.e. ActiveMQObjectMessage, ActiveMQMapMessage)? I believe some of these these suffer from the same problem.
        Hide
        Gary Tully added a comment -

        This fix breaks the topic case with multiple consumers and a vm transport producer. There is contention over marshaling so whacking the marshaled state is not possible, results in missing content for some of the consumers. https://issues.apache.org/activemq/browse/AMQ-2966

        Think the fix is to have an reduceMemoryFootprint policy option that when enabled for queues, will clear the marshaled state (can work for map and object messages also) when the message is first persisted. This is a natural sync point that is contention free provided the concurrentStoreAndDispatchQueue KahaDB option that tries to optimize out persistence, is not enabled.

        Show
        Gary Tully added a comment - This fix breaks the topic case with multiple consumers and a vm transport producer. There is contention over marshaling so whacking the marshaled state is not possible, results in missing content for some of the consumers. https://issues.apache.org/activemq/browse/AMQ-2966 Think the fix is to have an reduceMemoryFootprint policy option that when enabled for queues, will clear the marshaled state (can work for map and object messages also) when the message is first persisted. This is a natural sync point that is contention free provided the concurrentStoreAndDispatchQueue KahaDB option that tries to optimize out persistence, is not enabled.
        Hide
        Gary Tully added a comment -

        patch reverted and new policy based solution applied in r1021466

        The patch, setting the text attribute to null, was unprotected. Currently there are no destructive operations, on a message, allowing it to be shared across topic consumers without synchronization, which allows concurrent dispatch to be fast. This is important to maintain.

        A new destination policy for queues, boolean attribute reduceMemoryFootprint has been introduced to do meet this use case.
        When true, the marshaled state of a message will be cleared once the message is persisted in the store. This is a natural, contention free sync point.
        Note: The kahaDB concurrentStoreAndForwardQueues option (which introduces concurrency) and the memory persistence adapter (which will not marshall) are not compatible with this new policy.

        Trevor, can you validate that this change meet your use case?

        Show
        Gary Tully added a comment - patch reverted and new policy based solution applied in r1021466 The patch, setting the text attribute to null, was unprotected. Currently there are no destructive operations, on a message, allowing it to be shared across topic consumers without synchronization, which allows concurrent dispatch to be fast. This is important to maintain. A new destination policy for queues, boolean attribute reduceMemoryFootprint has been introduced to do meet this use case. When true, the marshaled state of a message will be cleared once the message is persisted in the store. This is a natural, contention free sync point. Note: The kahaDB concurrentStoreAndForwardQueues option (which introduces concurrency) and the memory persistence adapter (which will not marshall) are not compatible with this new policy. Trevor, can you validate that this change meet your use case?
        Hide
        Trevor Pounds added a comment -

        This fix looks like it will work but I do have some concerns implementing it as a destination policy. Wouldn't it make more sense to implement this using SoftReferences or a hook into the MarshallAware interface? In my opinion it makes sense to clear memory at the discretion of the collector when the message has been completely serialized to disk when there is not enough memory. The actual problem I was trying to solve was to prevent "hidden" memory from hanging around that was not tied to the ActiveMQ memory usage policies. I could be missing something here but it seems there are disjoint philosophies for memory management related to overall broker memory usage, fast dispatch cursors and other unaccounted for "live" objects. I'm not sure what to do here but the proposed fix (including my original submission) seems like a kludge at best. Does anyone else have other opinions?

        Show
        Trevor Pounds added a comment - This fix looks like it will work but I do have some concerns implementing it as a destination policy. Wouldn't it make more sense to implement this using SoftReferences or a hook into the MarshallAware interface? In my opinion it makes sense to clear memory at the discretion of the collector when the message has been completely serialized to disk when there is not enough memory. The actual problem I was trying to solve was to prevent "hidden" memory from hanging around that was not tied to the ActiveMQ memory usage policies. I could be missing something here but it seems there are disjoint philosophies for memory management related to overall broker memory usage, fast dispatch cursors and other unaccounted for "live" objects. I'm not sure what to do here but the proposed fix (including my original submission) seems like a kludge at best. Does anyone else have other opinions?
        Hide
        Gary Tully added a comment -

        I think there is merit in the original use case, but soft references would not cut it as the messages are retained in memory by the cursors and there is duplication. The key issue with any destruction of a message in memory is that it requires a sync point. A general MarshallAware interface hook would require a sync on message that would have a large impact on scaleability of topics.

        I agree, that there needs to be some tie with available vm memory and cursor memory usage, but that sort of scheme would require major surgery atm.

        Memory management is something that is central to the amq 6.0 (apollo) architecture.

        Show
        Gary Tully added a comment - I think there is merit in the original use case, but soft references would not cut it as the messages are retained in memory by the cursors and there is duplication. The key issue with any destruction of a message in memory is that it requires a sync point. A general MarshallAware interface hook would require a sync on message that would have a large impact on scaleability of topics. I agree, that there needs to be some tie with available vm memory and cursor memory usage, but that sort of scheme would require major surgery atm. Memory management is something that is central to the amq 6.0 (apollo) architecture.
        Hide
        Gary Tully added a comment -

        resolving.
        addressing any further limitations to the memory usage tracking as it pertains to actual jvm usage will be punted to 6.0

        Show
        Gary Tully added a comment - resolving. addressing any further limitations to the memory usage tracking as it pertains to actual jvm usage will be punted to 6.0

          People

          • Assignee:
            Gary Tully
            Reporter:
            Trevor Pounds
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development