Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Fixed
-
3.0
-
None
-
Journal+Derby
Description
Matt Sinclair <msd@msdservices.com> Reported on the user list:
Hello,
I have been experiencing problems with ActiveMQ-3.1M2 dropping messages on durable subscriptions.
On a test submitting some 70,000 messages, each with approximately 1K payloads, the broker stalled somewhere around 53,911 for a couple minutes. The broker's process began consuming 50% or more CPU, and the client was basically idle. After a couple minutes the broker began delivering messages but at a much slower rate. In the end, only 68398 of the 70000 messages were delivered.
In the error log there were exactly 1602 "Message probably expired: null" error messages. In terms of timing, I am not sure how these error messages correlate with the stall, though I believe the stall occurred soon after these messages.
Immediately prior to this problem and the 1602 messages there were a series of JMS exceptions in the broker's log. The exceptions are detailed below.
This problem seems to be reproducible, though it happens less frequently with smaller batches of messages. For example, sending batches of 10000, 30000, and 50000 generally work fine.
However, in tests sending batches of 50000 messages, exceptions about not being able to checkpoint the database were seen, though all messages were delivered. What is interesting is that checkpoint failure seems to corresponded with the last message or second to last message in the batch. Stacktraces are listed below.
The broker, client, and producer are running on the same machine, connected via TCP transport. System is Java 1.4.2 on Mac OS X 10.3.9. I can test Solaris/Sparc on Java 5 if that would be helpful.
Some of the problems appear to be ones previously reported on this list. Obviously, the reliability of the broker is troubling. If there is anything I can do, by way of running special builds or running more tests or etc., I would be more than willing to invest some time and resources in helping to track down and understand this serious problem.
Matt Sinclair
Sending 70000 messages, only 68398 delviered
----------------------------------------------------------
1117944936 20050605-04:15:36 :activemq: :E: :DBR0000: javax.jms.JMSException: Could not locate message in database with message id: ID:matthew-sinclair-day.local-63200-1117944927411-7:2000
at org.activemq.store.jdbc.JDBCMessageStore.getMessageSequenceId(JDBCMessageStore.java:133)
at
org.activemq.store.jdbc.JDBCTopicMessageStore.setLastAcknowledgedMessageIdentity(JDBCTopicMessageStore.java:49)
at
org.activemq.store.journal.JournalTopicMessageStore$1.execute(JournalTopicMessageStore.java:111)
at
org.activemq.util.TransactionTemplate.run(TransactionTemplate.java:41)
at org.activemq.store.journal.JournalTopicMessageStore.checkpoint(JournalTopicMessageStore.java:103)
at
org.activemq.store.journal.JournalPersistenceAdapter$3.run(JournalPersistenceAdapter.java:365)
at
EDU.oswego.cs.dl.util.concurrent.QueuedExecutor$RunLoop.run(Unknown Source)
at java.lang.Thread.run(Thread.java:552)
1117945062 20050605-04:17:42 :activemq: :W: :DBR0000: Old message cleanup failed due to: SQL Exception: A lock could not be obtained within the time requested
1117945062 20050605-04:17:42 :activemq: :I: :DBR0000: Old message cleanup failed due to: SQL Exception: A lock could not be obtained within the time requested
1117945062 20050605-04:17:42 :activemq: :E: :DBR0000: ERROR 40XL1: A lock could not be obtained within the time requested
at org.apache.derby.iapi.error.StandardException.newException(StandardException.java)
at
org.apache.derby.impl.services.locks.LockSet.lockObject(LockSet.java)
at org.apache.derby.impl.services.locks.SinglePool.lockAnObject(SinglePool.java)
at
org.apache.derby.impl.services.locks.SinglePool.lockObject(SinglePool.java)
at org.apache.derby.impl.store.raw.xact.RowLocking2.lockRecordForRead(RowLocking2.java)
at
org.apache.derby.impl.store.access.conglomerate.OpenConglomerate.lockPositionForRead(OpenConglomerate.java)
at
org.apache.derby.impl.store.access.conglomerate.GenericScanController.fetchRows(GenericScanController.java)
at
org.apache.derby.impl.store.access.heap.HeapScan.fetchNext(HeapScan.java)
at org.apache.derby.impl.sql.execute.TableScanResultSet.getNextRowCore(TableScanResultSet.java)
at
org.apache.derby.impl.sql.execute.ProjectRestrictResultSet.getNextRowCore(ProjectRestrictResultSet.java)
at
org.apache.derby.impl.sql.execute.DMLWriteResultSet.getNextRowCore(DMLWriteResultSet.java)
at
org.apache.derby.impl.sql.execute.DeleteResultSet.collectAffectedRows(DeleteResultSet.java)
at
org.apache.derby.impl.sql.execute.DeleteResultSet.open(DeleteResultSet.java)
at org.apache.derby.impl.sql.GenericPreparedStatement.execute(GenericPreparedStatement.java)
at
org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(EmbedStatement.java)
at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(EmbedPreparedStatement.java)
at
org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeUpdate(EmbedPreparedStatement.java)
at
org.apache.commons.dbcp.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:94)
at
org.apache.commons.dbcp.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:94)
at
org.activemq.store.jdbc.adapter.DefaultJDBCAdapter.doDeleteOldMessages(DefaultJDBCAdapter.java:574)
at
org.activemq.store.jdbc.JDBCPersistenceAdapter.cleanup(JDBCPersistenceAdapter.java:252)
at
org.activemq.store.jdbc.JDBCPersistenceAdapter$1.run(JDBCPersistenceAdapter.java:241)
at
EDU.oswego.cs.dl.util.concurrent.ClockDaemon$RunLoop.run(Unknown Source)
at java.lang.Thread.run(Thread.java:552)
1117945076 20050605-04:17:56 :activemq: :E: :DBR0000: Failed to checkpoint a message store: javax.jms.JMSException: Could not locate message in database with message id: ID:matthew-sinclair-day.local-63200-1117944927411-7:53000
1117945076 20050605-04:17:56 :activemq: :E: :DBR0000: javax.jms.JMSException: Could not locate message in database with message id: ID:matthew-sinclair-day.local-63200-1117944927411-7:53000
at org.activemq.store.jdbc.JDBCMessageStore.getMessageSequenceId(JDBCMessageStore.java:133)
at
org.activemq.store.jdbc.JDBCTopicMessageStore.setLastAcknowledgedMessageIdentity(JDBCTopicMessageStore.java:49)
at
org.activemq.store.journal.JournalTopicMessageStore$1.execute(JournalTopicMessageStore.java:111)
at
org.activemq.util.TransactionTemplate.run(TransactionTemplate.java:41)
at org.activemq.store.journal.JournalTopicMessageStore.checkpoint(JournalTopicMessageStore.java:103)
at
org.activemq.store.journal.JournalPersistenceAdapter$3.run(JournalPersistenceAdapter.java:365)
at
EDU.oswego.cs.dl.util.concurrent.QueuedExecutor$RunLoop.run(Unknown Source)
at java.lang.Thread.run(Thread.java:552)
1117945076 20050605-04:17:56 :activemq: :I: :DBR0000: Message probably expired: null
1117945076 20050605-04:17:56 :activemq: :I: :DBR0000: Message probably expired: null
[repeat another 1599 times]
1117945113 20050605-04:18:33 :activemq: :I: :DBR0000: Message probably expired: null
Sending 50000 messages, all delivered
-----------------------------------------------
1117930998 20050605-00:23:18 :activemq: :E: :DBR0000: Failed to checkpoint a message store: javax.jms.JMSException: Could not locate message in database with message id: ID:matthew-sinclair-day.local-62997-1117930799196-7:49999
1117930998 20050605-00:23:18 :activemq: :E: :DBR0000: javax.jms.JMSException: Could not locate message in database with message id: ID:matthew-sinclair-day.local-62997-1117930799196-7:49999
at org.activemq.store.jdbc.JDBCMessageStore.getMessageSequenceId(JDBCMessageStore.java:133)
at
org.activemq.store.jdbc.JDBCTopicMessageStore.setLastAcknowledgedMessageIdentity(JDBCTopicMessageStore.java:49)
at
org.activemq.store.journal.JournalTopicMessageStore$1.execute(JournalTopicMessageStore.java:111)
at
org.activemq.util.TransactionTemplate.run(TransactionTemplate.java:41)
at org.activemq.store.journal.JournalTopicMessageStore.checkpoint(JournalTopicMessageStore.java:103)
at
org.activemq.store.journal.JournalPersistenceAdapter$3.run(JournalPersistenceAdapter.java:365)
at
EDU.oswego.cs.dl.util.concurrent.QueuedExecutor$RunLoop.run(Unknown Source)
at java.lang.Thread.run(Thread.java:552)