Key: AMQ-1918
Type: Bug
Status: Resolved
Resolution: Fixed
Priority: Critical
Assignee: Rob Davies
Reporter: Richard Yarger
Votes: 3
Watchers: 3
ActiveMQ

AbstractStoreCursor.size gets out of synch with Store size and blocks consumers

Created: 28/Aug/08 07:28 AM   Updated: 26/May/09 01:50 AM
Component/s: Message Store
Affects Version/s: 5.1.0
Fix Version/s: 5.3.0

Time Tracking:
Not Specified

File Attachments:
activemq.xml (XML file, 6 kB), 2008-08-28 07:28 AM, Richard Yarger
NegativeQueueCursorSupport.java (Java source file, 15 kB, licensed for inclusion in ASF works), 2009-03-30 09:58 AM, Richard Yarger
testAMQMessageStore.zip (Zip archive, 4.52 MB), 2008-08-28 03:16 PM, Richard Yarger
testdata.zip (Zip archive, 156 kB, licensed for inclusion in ASF works), 2008-09-24 03:59 AM, Nicusor Tanase


Description:
In version 5.1.0, our queue consumers stop consuming for no apparent reason.
We have a staged queue environment, and we occasionally see one queue display negative pending message counts that hover around -x, rise gradually to -x+n, and then fall back abruptly to -x. The messages are building up and being processed in bunches, but it's not easy to see because the counts are negative. We see the same behavior in the messages coming out of the system: outbound messages come out in bunches, synchronized with the queue pending count dropping to -x.

This issue does not happen ALL of the time. It happens about once a week, and the only way to fix it is to bounce the broker. It doesn't happen to the same queue every time, so it is not our consuming code.

Although we don't have a reproducible scenario, we have been able to debug the issue in our test environment.
We traced the problem to the cached store size in the AbstractStoreCursor.
This value becomes 0 or negative and prevents the AbstractStoreCursor from retrieving more messages from the store (see AbstractStoreCursor.fillBatch()).
We have seen the size value go lower than -1000.
We have also forced it to fix itself by sending in n+1 messages. Once the size goes above zero, the cached value is refreshed and things work OK again.
Unfortunately, when volume is low it can take hours for n+1 messages to arrive, so our message latency rises during those periods.
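
The failure mode described above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions and is not the ActiveMQ source: the class CachedSizeCursorSketch and its methods (addMessage, forceDrift, and this simplified fillBatch) are hypothetical names invented for the illustration, and the cause of the drift is not modeled, only its effect. The sketch assumes that the cursor skips the store while its cached size is 0 or negative and refreshes the cached value from the store once it rises above zero, as reported.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical model of a cursor with a cached store size. Not ActiveMQ code. */
class CachedSizeCursorSketch {
    private final Queue<String> store = new ArrayDeque<>();
    private int cachedSize; // the cursor's (possibly wrong) view of the store size

    /** A producer adds a message: the store grows and the cached counter is incremented. */
    void addMessage(String message) {
        store.add(message);
        cachedSize++;
    }

    /** Simulates the unexplained drift by shifting only the cached counter. */
    void forceDrift(int delta) {
        cachedSize += delta;
    }

    /** Returns up to maxBatch messages, but only if the cached size says there are any. */
    List<String> fillBatch(int maxBatch) {
        List<String> batch = new ArrayList<>();
        if (cachedSize <= 0) {
            return batch; // the cursor believes the store is empty and never reads it
        }
        cachedSize = store.size(); // assumed refresh once the counter is positive again
        while (batch.size() < maxBatch && !store.isEmpty()) {
            batch.add(store.poll());
            cachedSize--;
        }
        return batch;
    }

    int cachedSize() { return cachedSize; }

    int storeSize() { return store.size(); }

    public static void main(String[] args) {
        CachedSizeCursorSketch cursor = new CachedSizeCursorSketch();
        cursor.forceDrift(-3); // cached size is now -x with x = 3

        for (int i = 1; i <= 3; i++) {
            cursor.addMessage("m" + i); // n = 3 messages bring the counter back to 0
        }
        // Consumers are still blocked although the store holds 3 messages.
        System.out.println("stalled batch: " + cursor.fillBatch(10).size()
                + ", store size: " + cursor.storeSize());

        cursor.addMessage("m4"); // message n+1 pushes the counter above zero
        System.out.println("recovered batch: " + cursor.fillBatch(10).size());
    }
}
```

In this model the consumers stay blocked for as long as the counter is at or below zero, no matter how many messages sit in the store, which matches the bunched delivery we see on the outbound side.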

I have attached our broker config.



Change History:
Richard Yarger made changes - 28/Aug/08 03:16 PM
Attachment: (none) -> testAMQMessageStore.zip [ 16921 ]
Rob Davies made changes - 30/Aug/08 11:54 PM
Assignee: (none) -> Rob Davies [ rajdavies ]
Rob Davies made changes - 04/Sep/08 07:34 AM
Status: Open [ 1 ] -> Resolved [ 5 ]
Resolution: (none) -> Fixed [ 1 ]
Fix Version/s: (none) -> 5.2.0 [ 11841 ]
Richard Yarger made changes - 05/Sep/08 08:55 AM
Resolution: Fixed [ 1 ] -> (none)
Status: Resolved [ 5 ] -> Reopened [ 4 ]
Gary Tully made changes - 08/Sep/08 07:54 AM
Fix Version/s: 5.2.0 [ 11841 ] -> 5.3.0 [ 11914 ]
Nicusor Tanase made changes - 24/Sep/08 03:59 AM
Attachment: (none) -> testdata.zip [ 17018 ]
Rob Davies made changes - 28/Dec/08 03:13 PM
Status: Reopened [ 4 ] -> Resolved [ 5 ]
Resolution: (none) -> Fixed [ 1 ]
Richard Yarger made changes - 13/Jan/09 01:40 PM
Status: Resolved [ 5 ] -> Reopened [ 4 ]
Resolution: Fixed [ 1 ] -> (none)
Richard Yarger made changes - 30/Mar/09 09:58 AM
Attachment: (none) -> NegativeQueueCursorSupport.java [ 17860 ]
Rob Davies made changes - 26/May/09 01:50 AM
Resolution: (none) -> Fixed [ 1 ]
Status: Reopened [ 4 ] -> Resolved [ 5 ]