ActiveMQ / AMQ-1797

Persistent messages for durable subscribers are not purged from disk storage

    Details

• Type: Bug
• Status: Resolved
• Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.1.0
    • Fix Version/s: 5.3.0
    • Component/s: Message Store
    • Labels:
      None
    • Environment:

      WinXP,
      java version "1.6.0_05"
      Java(TM) SE Runtime Environment (build 1.6.0_05-b13)
      Java HotSpot(TM) Client VM (build 10.0-b19, mixed mode, sharing)

      Description

I have some problems with multiple durable subscribers when persistent delivery is used.
A message should be deleted from disk storage (periodically) after all interested subscribers have received it. With one durable subscriber everything works just fine. However, when I tested this scenario with two durable subscribers on a topic, the messages were never deleted (disk usage grows continuously and is never purged).

Test case description (a minimal sketch of the setup follows below):
- using the default activemq.xml configuration file from the binary distribution
- TopicConsumers.java creates two separate durable consumers
- textMessageFile.txt contains the text that is sent by the producer (place it on the classpath)

Result:
- all messages that were sent were also received by both subscribers
- after more than 100k messages the data directory is about 1 GB
- restarting ActiveMQ with such a large persistent store takes a very long time
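
For orientation, here is a minimal sketch of the scenario described above: two durable subscribers on one topic, fed by a PERSISTENT producer. This is not the attached TopicConsumers.java/TopicProducer.java; class names, client IDs, subscription names and the inline payload are illustrative (the attached producer reads its text from textMessageFile.txt).

{code:java}
import javax.jms.Connection;
import javax.jms.DeliveryMode;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.MessageProducer;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.jms.Topic;
import javax.jms.TopicSubscriber;

import org.apache.activemq.ActiveMQConnectionFactory;

// Minimal sketch of the reported scenario (illustrative, not the attached test code):
// two durable subscribers on the same topic, fed by a PERSISTENT producer.
public class TwoDurableSubscribersSketch {

    static final String BROKER_URL = "tcp://localhost:61616";
    static final String TOPIC_NAME = "testTopic";

    static Connection startDurableSubscriber(final String clientId, final String subName) throws Exception {
        Connection connection = new ActiveMQConnectionFactory(BROKER_URL).createConnection();
        connection.setClientID(clientId);       // a stable client id is required for durable subscriptions
        connection.start();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Topic topic = session.createTopic(TOPIC_NAME);
        TopicSubscriber subscriber = session.createDurableSubscriber(topic, subName);
        subscriber.setMessageListener(new MessageListener() {
            public void onMessage(Message message) {
                System.out.println(subName + " received " + message);
            }
        });
        return connection;
    }

    public static void main(String[] args) throws Exception {
        // Two independent durable subscriptions on the same topic.
        Connection sub1 = startDurableSubscriber("client1", "sub1");
        Connection sub2 = startDurableSubscriber("client2", "sub2");

        // Persistent producer; the attached TopicProducer reads its payload from textMessageFile.txt.
        Connection producerConnection = new ActiveMQConnectionFactory(BROKER_URL).createConnection();
        producerConnection.start();
        Session session = producerConnection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageProducer producer = session.createProducer(session.createTopic(TOPIC_NAME));
        producer.setDeliveryMode(DeliveryMode.PERSISTENT);

        for (int i = 0; i < 100000; i++) {
            TextMessage message = session.createTextMessage("payload " + i);
            producer.send(message);
        }

        // Expected behaviour: once both durable subscribers have acknowledged a message,
        // the broker should eventually remove it from the persistent store.
        producerConnection.close();
        sub1.close();
        sub2.close();
    }
}
{code}

If the purge worked as expected, the data directory should stop growing shortly after both subscribers have caught up; in the reported test it keeps growing instead.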

Attachments

1. ASF.LICENSE.NOT.GRANTED--TopicConsumers.java (3 kB, Juraj Kuruc)
2. ASF.LICENSE.NOT.GRANTED--TopicProducer.java (4 kB, Juraj Kuruc)
3. ASF.LICENSE.NOT.GRANTED--textMessageFile.txt (9 kB, Juraj Kuruc)
4. Storage_cleanup_Redelivery_test.zip (9 kB, Juraj Kuruc)
5. kaha_amq.zip (9 kB, Juraj Kuruc)
6. activemq.xml (5 kB, Juraj Kuruc)

        Activity

        Juraj Kuruc added a comment -

It looks like there is a problem in the ActiveMQ journal, which is used by the default amqPersistenceAdapter. I have also tested this on the AMQ 4.1.1 journaledJDBC adapter, with the same bug.
When the jdbcPersistenceAdapter with 4.1.1 (non-journal) or the kahaPersistenceAdapter with 5.1 is used with 2 durable topic subscribers, disk storage is periodically emptied.
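
For reference, a minimal sketch (assuming the 5.x embedded-broker API; broker name, directory and connector URI are illustrative) of switching a broker to the kahaPersistenceAdapter, which the comment above reports as the working configuration. The same change is normally made via the <persistenceAdapter> element in activemq.xml.

{code:java}
import java.io.File;

import org.apache.activemq.broker.BrokerService;
import org.apache.activemq.store.kahadaptor.KahaPersistenceAdapter;

// Sketch: embedded broker using the kahaPersistenceAdapter instead of the default
// amqPersistenceAdapter. Broker name, directory and connector URI are illustrative.
public class KahaBrokerSketch {
    public static void main(String[] args) throws Exception {
        BrokerService broker = new BrokerService();
        broker.setBrokerName("localhost");

        KahaPersistenceAdapter kaha = new KahaPersistenceAdapter();
        kaha.setDirectory(new File("data/kaha"));    // where the Kaha store files are kept
        broker.setPersistenceAdapter(kaha);

        broker.addConnector("tcp://localhost:61616");
        broker.start();
    }
}
{code}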

Rob Davies added a comment - edited

        Fixed by SVN revision 667752 and revision 667756

Juraj Kuruc added a comment - edited

I've tested an AMQ 5.2 SNAPSHOT for this fix, with partial success. There isn't any problem when every message from the topic is acknowledged, but the solution doesn't work when some messages are received normally by the durable subscribers and some are redelivered until they end up in ActiveMQ.DLQ due to exceeding the maximum redelivery count.
To reproduce the error, see the attached zip.

- TopicRedeliver creates two separate durable consumers, which consume every odd message and recover the session if the message number is even (see the sketch after this list). The parameter "jms.redeliveryPolicy.maximumRedeliveries=1" causes a message to go to the dead letter queue after the first redelivery.
- DlqConsumer consumes messages from ActiveMQ.DLQ
- TopicProducer sends messages to testTopic
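
Minimal sketches (assumed from the description above, not taken from the attached zip) of the TopicRedeliver and DlqConsumer ideas. The message-number property name, client IDs and subscription names are illustrative.

{code:java}
import javax.jms.Connection;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.Session;
import javax.jms.Topic;
import javax.jms.TopicSubscriber;

import org.apache.activemq.ActiveMQConnectionFactory;

// Sketch: acknowledge odd-numbered messages, call session.recover() for even-numbered
// ones so they are redelivered; with maximumRedeliveries=1 the redelivered message is
// sent to ActiveMQ.DLQ after the first failed redelivery.
public class TopicRedeliverSketch {

    // Redelivery policy set on the connection URL, as described above.
    static final String URL = "tcp://localhost:61616?jms.redeliveryPolicy.maximumRedeliveries=1";

    public static void main(String[] args) throws Exception {
        Connection connection = new ActiveMQConnectionFactory(URL).createConnection();
        connection.setClientID("redeliver-client");                       // illustrative
        connection.start();

        final Session session = connection.createSession(false, Session.CLIENT_ACKNOWLEDGE);
        Topic topic = session.createTopic("testTopic");
        TopicSubscriber subscriber = session.createDurableSubscriber(topic, "redeliver-sub");

        subscriber.setMessageListener(new MessageListener() {
            public void onMessage(Message message) {
                try {
                    int number = message.getIntProperty("messageNumber"); // assumed property
                    if (number % 2 == 1) {
                        message.acknowledge();   // odd messages are consumed normally
                    } else {
                        session.recover();       // even messages are redelivered, then dead-lettered
                    }
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        });
    }
}
{code}

The DlqConsumer side is then essentially a plain queue consumer on ActiveMQ.DLQ:

{code:java}
import javax.jms.Connection;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.MessageListener;
import javax.jms.Queue;
import javax.jms.Session;

import org.apache.activemq.ActiveMQConnectionFactory;

// Sketch: drain the dead letter queue that the redelivery test fills up.
public class DlqConsumerSketch {
    public static void main(String[] args) throws Exception {
        Connection connection =
                new ActiveMQConnectionFactory("tcp://localhost:61616").createConnection();
        connection.start();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Queue dlq = session.createQueue("ActiveMQ.DLQ");
        MessageConsumer consumer = session.createConsumer(dlq);
        consumer.setMessageListener(new MessageListener() {
            public void onMessage(Message message) {
                System.out.println("DLQ message: " + message);
            }
        });
    }
}
{code}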

        Reproducing steps:
        1. run DlqConsumer
        2. run TopicRedeliver
        3. run TopicProducer

Results:
1.
The %ACTIVEMQ_HOME%/data/journal dir contains several data-x files. The more messages you send, the more data-x files stay in that dir and consume disk space (some of them are deleted, but this behaviour isn't predictable).
Something is obviously going wrong, because all messages are consumed from the destinations (testTopic and ActiveMQ.DLQ).
2.
The size of ActiveMQ.DLQ shown in JConsole is a positive number, even though DlqConsumer doesn't consume messages anymore.
3.
If TopicRedeliver and DlqConsumer are restarted, any further sending of messages may cause the following exception:
ERROR RecoveryListenerAdapter - Message id ID:sk1d069c-4253-1225900870814-0:0:1:1:42 could not be recovered from the data store - already dispatched

Using the AMQ 5.2 stable release it is not even possible to run this test - after a couple hundred messages one of the durable subscribers gets stuck (see https://issues.apache.org/activemq/browse/AMQ-2021).

        Juraj Kuruc added a comment -

Tests on AMQ revision 729038 with the amqPersistenceAdapter as well as with the kahaPersistenceAdapter; neither of them was successful, for different reasons.

With the amqPersistenceAdapter, 100k msgs have been sent and the data directory size is 1.1 GB.
Further message sending doesn't make any difference - the message store keeps growing. In addition, the following error occurs several times:
ERROR RecoveryListenerAdapter - Message id ... could not be recovered from the data store - already dispatched

Using the kahaPersistenceAdapter, after 100k msgs one of the consumers receives 100k msgs, the second one 99999 msgs. After another 300k msgs the data directory size is 410 MB. Also some errors are logged by AMQ:

        ERROR DataManagerImpl - Looking for key 15 but not found in fileMap:

        {17=data-queue-data-17 number = 17 , length = 177868 refCount = 6, 16=data-queue-data-16 number = 16 , length = 33546816 refCount = 86}

        ERROR MapContainerImpl - Failed to get value for offset=4845, key=(15, 29641420, 47), value=(15, 29641472, 8839), previousItem=-1, nextItem=-1
        java.io.IOException: Could not locate data file data-queue-data-15
        at org.apache.activemq.kaha.impl.data.DataManagerImpl.getDataFile(DataManagerImpl.java:129)
        at org.apache.activemq.kaha.impl.data.SyncDataFileReader.readItem(SyncDataFileReader.java:65)
        at org.apache.activemq.kaha.impl.data.DataManagerImpl.readItem(DataManagerImpl.java:141)
        at org.apache.activemq.kaha.impl.container.MapContainerImpl.getValue(MapContainerImpl.java:481)
        at org.apache.activemq.store.kahadaptor.KahaMessageStore.recoverNextMessages(KahaMessageStore.java:166)
        at org.apache.activemq.store.ProxyMessageStore.recoverNextMessages(ProxyMessageStore.java:83)
        at org.apache.activemq.broker.region.cursors.QueueStorePrefetch.doFillBatch(QueueStorePrefetch.java:92)
        at org.apache.activemq.broker.region.cursors.AbstractStoreCursor.fillBatch(AbstractStoreCursor.java:236)
        at org.apache.activemq.broker.region.cursors.AbstractStoreCursor.hasNext(AbstractStoreCursor.java:136)
        at org.apache.activemq.broker.region.cursors.StoreQueueCursor.hasNext(StoreQueueCursor.java:131)
        at org.apache.activemq.broker.region.Queue.doPageIn(Queue.java:1180)
        at org.apache.activemq.broker.region.Queue.pageInMessages(Queue.java:1308)
        at org.apache.activemq.broker.region.Queue.iterate(Queue.java:1008)
        at org.apache.activemq.thread.DeterministicTaskRunner.runTask(DeterministicTaskRunner.java:84)
        at org.apache.activemq.thread.DeterministicTaskRunner$1.run(DeterministicTaskRunner.java:41)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
        at java.lang.Thread.run(Thread.java:619)
ERROR AbstractStoreCursor - Failed to fill batch

For more, see the attached log - kaha_amq.log.

For these tests it was necessary to adjust the memory usage and limit settings - see the attached activemq.xml (a programmatic sketch of the same idea follows below).
Tested on SUSE Linux Enterprise Server 10.
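
For context, a minimal sketch (assumed; the concrete limits are illustrative and not taken from the attached activemq.xml) of adjusting the broker's memory, store and temp usage limits via the embedded-broker API. The attached file presumably sets the equivalent <systemUsage> values in XML.

{code:java}
import org.apache.activemq.broker.BrokerService;

// Sketch: raise the broker usage limits programmatically; the attached activemq.xml
// presumably configures the equivalent <systemUsage> values. Limits are illustrative.
public class UsageLimitsSketch {
    public static void main(String[] args) throws Exception {
        BrokerService broker = new BrokerService();

        broker.getSystemUsage().getMemoryUsage().setLimit(64L * 1024 * 1024);       // 64 MB
        broker.getSystemUsage().getStoreUsage().setLimit(2L * 1024 * 1024 * 1024);  // 2 GB
        broker.getSystemUsage().getTempUsage().setLimit(1L * 1024 * 1024 * 1024);   // 1 GB

        broker.addConnector("tcp://localhost:61616");
        broker.start();
    }
}
{code}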

        Juraj Kuruc added a comment -

Config file used to test revision 729038.

Gary Tully added a comment - edited

The fix for http://issues.apache.org/activemq/browse/AMQ-2123 will help with the "could not be recovered from the data store - already dispatched" issue. It should also help with store management, as it ensures that references are correctly cleaned up.

        Juraj Kuruc added a comment -

I've tested AMQ revision 747951. Cleaning the storage with the amqPersistenceAdapter is better now, but there are still files that, in my opinion, should be deleted - in the data/journal/ directory, after receiving stops, there are still files data-47 and data-87 and no other numbers in between (I guess that in the optimal state the file numbering should be continuous).
There is also the problem that, of the 100k msgs sent, only 97208 have been acknowledged (it should be 100k on both receivers, since every odd message is acked), and the "already dispatched" error is still there.

With the kahaPersistenceAdapter the behaviour is very similar to before.

        ying added a comment -

Tested the latest trunk, revision 756591.

In a network-of-brokers setup (4 master/slave pairs), if I kill TopicConsumers while it is processing messages and then restart it, I see "Message id ... could not be recovered from the data store - already dispatched".

I also see:
ERROR MasterBroker - Slave Failed
javax.jms.JMSException: Unmatched acknowledege: MessageAck

{commandId = 62608, responseRequired = true, ackType = 2, consumerId = ID:host-2343-1237580790173-0:1:1:1, firstMessageId = ID:host-2341-1237580773986-0:0:1:1:1341, lastMessageId = ID:host-2341-1237580773986-0:0:1:1:1341, destination = topic://testTopic, transactionId = null, messageCount = 1}

; Could not find Message-ID ID:host-2341-1237580773986-0:0:1:1:1341 in dispatched-list (start of ack)

With one master/slave pair the results are much better; the topic consumer seems to recover and continue processing.

        Dejan Bosanac added a comment -

Some of the commits for AMQ-2303 seem to have fixed this one as well (at least as far as I can test it). Please reopen if it doesn't work with SVN revision 790113.


People

• Assignee: Dejan Bosanac
• Reporter: Juraj Kuruc
• Votes: 5
• Watchers: 4
