Uploaded image for project: 'ActiveMQ Classic'
  1. ActiveMQ Classic
  2. AMQ-264

Durable subscriptions + Journal: causes sporatic "Message probably expired: null" error message

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 3.0
    • 3.1
    • Message Store
    • None
    • Journal+Derby

    Description

      Matt Sinclair <msd@msdservices.com> Reported on the user list:

      Hello,

      I have been experiencing problems with ActiveMQ-3.1M2 dropping messages on durable subscriptions.

      On a test submitting some 70,000 messages, each with approximately 1K payloads, the broker stalled somewhere around 53,911 for a couple minutes. The broker's process began consuming 50% or more CPU, and the client was basically idle. After a couple minutes the broker began delivering messages but at a much slower rate. In the end, only 68398 of the 70000 messages were delivered.
      In the error log there were exactly 1602 "Message probably expired: null" error messages. In terms of timing, I am not sure how these error messages correlate with the stall, though I believe the stall occurred soon after these messages.

      Immediately prior to this problem and the 1602 messages there were a series of JMS exceptions in the broker's log. The exceptions are detailed below.

      This problem seems to be reproducible, though it happens less frequently with smaller batches of messages. For example, sending batches of 10000, 30000, and 50000 generally work fine.

      However, in tests sending batches of 50000 messages, exceptions about not being able to checkpoint the database were seen, though all messages were delivered. What is interesting is that checkpoint failure seems to corresponded with the last message or second to last message in the batch. Stacktraces are listed below.

      The broker, client, and producer are running on the same machine, connected via TCP transport. System is Java 1.4.2 on Mac OS X 10.3.9. I can test Solaris/Sparc on Java 5 if that would be helpful.

      Some of the problems appear to be ones previously reported on this list. Obviously, the reliability of the broker is troubling. If there is anything I can do, by way of running special builds or running more tests or etc., I would be more than willing to invest some time and resources in helping to track down and understand this serious problem.

      Matt Sinclair

      Sending 70000 messages, only 68398 delviered
      ----------------------------------------------------------
      1117944936 20050605-04:15:36 :activemq: :E: :DBR0000: javax.jms.JMSException: Could not locate message in database with message id: ID:matthew-sinclair-day.local-63200-1117944927411-7:2000
      at org.activemq.store.jdbc.JDBCMessageStore.getMessageSequenceId(JDBCMessageStore.java:133)
      at
      org.activemq.store.jdbc.JDBCTopicMessageStore.setLastAcknowledgedMessageIdentity(JDBCTopicMessageStore.java:49)
      at
      org.activemq.store.journal.JournalTopicMessageStore$1.execute(JournalTopicMessageStore.java:111)
      at
      org.activemq.util.TransactionTemplate.run(TransactionTemplate.java:41)
      at org.activemq.store.journal.JournalTopicMessageStore.checkpoint(JournalTopicMessageStore.java:103)
      at
      org.activemq.store.journal.JournalPersistenceAdapter$3.run(JournalPersistenceAdapter.java:365)
      at
      EDU.oswego.cs.dl.util.concurrent.QueuedExecutor$RunLoop.run(Unknown Source)
      at java.lang.Thread.run(Thread.java:552)

      1117945062 20050605-04:17:42 :activemq: :W: :DBR0000: Old message cleanup failed due to: SQL Exception: A lock could not be obtained within the time requested
      1117945062 20050605-04:17:42 :activemq: :I: :DBR0000: Old message cleanup failed due to: SQL Exception: A lock could not be obtained within the time requested
      1117945062 20050605-04:17:42 :activemq: :E: :DBR0000: ERROR 40XL1: A lock could not be obtained within the time requested
      at org.apache.derby.iapi.error.StandardException.newException(StandardException.java)
      at
      org.apache.derby.impl.services.locks.LockSet.lockObject(LockSet.java)
      at org.apache.derby.impl.services.locks.SinglePool.lockAnObject(SinglePool.java)
      at
      org.apache.derby.impl.services.locks.SinglePool.lockObject(SinglePool.java)
      at org.apache.derby.impl.store.raw.xact.RowLocking2.lockRecordForRead(RowLocking2.java)
      at
      org.apache.derby.impl.store.access.conglomerate.OpenConglomerate.lockPositionForRead(OpenConglomerate.java)
      at
      org.apache.derby.impl.store.access.conglomerate.GenericScanController.fetchRows(GenericScanController.java)
      at
      org.apache.derby.impl.store.access.heap.HeapScan.fetchNext(HeapScan.java)
      at org.apache.derby.impl.sql.execute.TableScanResultSet.getNextRowCore(TableScanResultSet.java)
      at
      org.apache.derby.impl.sql.execute.ProjectRestrictResultSet.getNextRowCore(ProjectRestrictResultSet.java)
      at
      org.apache.derby.impl.sql.execute.DMLWriteResultSet.getNextRowCore(DMLWriteResultSet.java)
      at
      org.apache.derby.impl.sql.execute.DeleteResultSet.collectAffectedRows(DeleteResultSet.java)
      at
      org.apache.derby.impl.sql.execute.DeleteResultSet.open(DeleteResultSet.java)
      at org.apache.derby.impl.sql.GenericPreparedStatement.execute(GenericPreparedStatement.java)
      at
      org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(EmbedStatement.java)
      at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(EmbedPreparedStatement.java)
      at
      org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeUpdate(EmbedPreparedStatement.java)
      at
      org.apache.commons.dbcp.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:94)
      at
      org.apache.commons.dbcp.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:94)
      at
      org.activemq.store.jdbc.adapter.DefaultJDBCAdapter.doDeleteOldMessages(DefaultJDBCAdapter.java:574)
      at
      org.activemq.store.jdbc.JDBCPersistenceAdapter.cleanup(JDBCPersistenceAdapter.java:252)
      at
      org.activemq.store.jdbc.JDBCPersistenceAdapter$1.run(JDBCPersistenceAdapter.java:241)
      at
      EDU.oswego.cs.dl.util.concurrent.ClockDaemon$RunLoop.run(Unknown Source)
      at java.lang.Thread.run(Thread.java:552)

      1117945076 20050605-04:17:56 :activemq: :E: :DBR0000: Failed to checkpoint a message store: javax.jms.JMSException: Could not locate message in database with message id: ID:matthew-sinclair-day.local-63200-1117944927411-7:53000
      1117945076 20050605-04:17:56 :activemq: :E: :DBR0000: javax.jms.JMSException: Could not locate message in database with message id: ID:matthew-sinclair-day.local-63200-1117944927411-7:53000
      at org.activemq.store.jdbc.JDBCMessageStore.getMessageSequenceId(JDBCMessageStore.java:133)
      at
      org.activemq.store.jdbc.JDBCTopicMessageStore.setLastAcknowledgedMessageIdentity(JDBCTopicMessageStore.java:49)
      at
      org.activemq.store.journal.JournalTopicMessageStore$1.execute(JournalTopicMessageStore.java:111)
      at
      org.activemq.util.TransactionTemplate.run(TransactionTemplate.java:41)
      at org.activemq.store.journal.JournalTopicMessageStore.checkpoint(JournalTopicMessageStore.java:103)
      at
      org.activemq.store.journal.JournalPersistenceAdapter$3.run(JournalPersistenceAdapter.java:365)
      at
      EDU.oswego.cs.dl.util.concurrent.QueuedExecutor$RunLoop.run(Unknown Source)
      at java.lang.Thread.run(Thread.java:552)
      1117945076 20050605-04:17:56 :activemq: :I: :DBR0000: Message probably expired: null
      1117945076 20050605-04:17:56 :activemq: :I: :DBR0000: Message probably expired: null
      [repeat another 1599 times]
      1117945113 20050605-04:18:33 :activemq: :I: :DBR0000: Message probably expired: null

      Sending 50000 messages, all delivered
      -----------------------------------------------
      1117930998 20050605-00:23:18 :activemq: :E: :DBR0000: Failed to checkpoint a message store: javax.jms.JMSException: Could not locate message in database with message id: ID:matthew-sinclair-day.local-62997-1117930799196-7:49999
      1117930998 20050605-00:23:18 :activemq: :E: :DBR0000: javax.jms.JMSException: Could not locate message in database with message id: ID:matthew-sinclair-day.local-62997-1117930799196-7:49999
      at org.activemq.store.jdbc.JDBCMessageStore.getMessageSequenceId(JDBCMessageStore.java:133)
      at
      org.activemq.store.jdbc.JDBCTopicMessageStore.setLastAcknowledgedMessageIdentity(JDBCTopicMessageStore.java:49)
      at
      org.activemq.store.journal.JournalTopicMessageStore$1.execute(JournalTopicMessageStore.java:111)
      at
      org.activemq.util.TransactionTemplate.run(TransactionTemplate.java:41)
      at org.activemq.store.journal.JournalTopicMessageStore.checkpoint(JournalTopicMessageStore.java:103)
      at
      org.activemq.store.journal.JournalPersistenceAdapter$3.run(JournalPersistenceAdapter.java:365)
      at
      EDU.oswego.cs.dl.util.concurrent.QueuedExecutor$RunLoop.run(Unknown Source)
      at java.lang.Thread.run(Thread.java:552)

      Attachments

        Activity

          People

            chirino Hiram R. Chirino
            chirino Hiram R. Chirino
            Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: