ActiveMQ Classic / AMQ-1686

Small window in wakeup logic for PooledTaskRunner - task can get executed in parallel

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.0.0
    • Fix Version/s: 5.1.0
    • Component/s: Broker
    • Labels: None
    • Environment: Windows XP
    • Patch Info: Patch Available

    Description

      org.apache.activemq.broker.region.cursors.CursorDurableTest intermittently fails on Windows with the error:

      Exception in thread "Persistence Adaptor Task" java.lang.NullPointerException
      at org.apache.activemq.store.amq.AMQMessageStore$4.execute(AMQMessageStore.java:381)
      at org.apache.activemq.util.TransactionTemplate.run(TransactionTemplate.java:44)
      at org.apache.activemq.store.amq.AMQMessageStore.doAsyncWrite(AMQMessageStore.java:374)
      at org.apache.activemq.store.amq.AMQMessageStore.asyncWrite(AMQMessageStore.java:341)
      at org.apache.activemq.store.amq.AMQMessageStore$1.iterate(AMQMessageStore.java:95)
      at org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:122)
      at org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:43)
      at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
      at java.lang.Thread.run(Thread.java:595)
      Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 303.672 sec <<< FAILURE!

      The problem appears to be in the interaction between wakeup and runTask in PooledTaskRunner.
      iterating is set to false in a finally block, and queued is checked in a separate synchronized block.
      If wakeup is called in the window between the two, it can set queued and find iterating false, so it will execute the task; runTask will then find queued true and it too will execute the task.
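
The interleaving described above can be replayed step by step. This is a minimal single-threaded model, not the PooledTaskRunner source: the class name, the `scheduled` counter, and the split of runTask into three step methods are assumptions made for illustration, and each step method stands in for one synchronized block.

```java
// Hypothetical model of the race window; names other than wakeup, runTask,
// iterating and queued are illustrative and do not come from ActiveMQ.
class RaceWindowSketch {
    boolean iterating;
    boolean queued;
    int scheduled; // how many times the task was handed to the executor

    // Models wakeup(): queue the task and schedule it if no run is in progress.
    void wakeup() {
        if (queued) {
            return;
        }
        queued = true;
        if (!iterating) {
            scheduled++;
        }
    }

    // Models the start of runTask().
    void runTaskStart() {
        queued = false;
        iterating = true;
    }

    // Models the finally block: iterating is cleared on its own.
    void runTaskFinally() {
        iterating = false;
    }

    // Models the separate block that checks queued afterwards.
    void runTaskQueuedCheck() {
        if (queued) {
            scheduled++;
        }
    }

    // Replays the problematic ordering and returns the number of schedules.
    static int simulate() {
        RaceWindowSketch r = new RaceWindowSketch();
        r.runTaskStart();
        r.runTaskFinally();
        r.wakeup();             // lands in the window: sees iterating == false
        r.runTaskQueuedCheck(); // also sees queued == true
        return r.scheduled;
    }

    public static void main(String[] args) {
        System.out.println("scheduled runs: " + simulate()); // prints "scheduled runs: 2"
    }
}
```

A single wakeup produces two schedules in this ordering, which on a thread pool means two threads can run the same task at once.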

      The fix is to include the queued check in the finally block so that iterating and queued are handled in the same synchronized block. I will attach a patch with this fix.
      I attempted to reproduce this problem with a unit test but did not have any real success; the window is quite small. I will include the unit test in case it can be improved upon.
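
A sketch of the fix idea, together with a small concurrency probe, might look like the following. It is not the attached patch: the class name, the `isIdle` helper, and the probe method are hypothetical, and shutdown handling and iteration limits are left out. The point it illustrates is that clearing iterating and checking queued happen under one lock, so wakeup can never observe the state between the two.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical runner illustrating the fix; not the ActiveMQ implementation.
class SingleRunnerSketch {
    private final Object lock = new Object();
    private final ExecutorService executor;
    private final Runnable work;
    private boolean iterating;
    private boolean queued;

    SingleRunnerSketch(ExecutorService executor, Runnable work) {
        this.executor = executor;
        this.work = work;
    }

    void wakeup() {
        synchronized (lock) {
            if (queued) {
                return;
            }
            queued = true;
            if (!iterating) {
                executor.execute(this::runTask);
            }
        }
    }

    private void runTask() {
        synchronized (lock) {
            queued = false;
            iterating = true;
        }
        try {
            work.run();
        } finally {
            // Clear iterating and check queued in the same synchronized block.
            synchronized (lock) {
                iterating = false;
                if (queued) {
                    executor.execute(this::runTask);
                }
            }
        }
    }

    boolean isIdle() {
        synchronized (lock) {
            return !queued && !iterating;
        }
    }

    // Probe: returns the highest number of overlapping executions observed.
    static int maxObservedConcurrency(int wakeups) {
        AtomicInteger active = new AtomicInteger();
        AtomicInteger max = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        SingleRunnerSketch runner = new SingleRunnerSketch(pool, () -> {
            int now = active.incrementAndGet();
            max.accumulateAndGet(now, Math::max);
            Thread.yield(); // widen the chance of overlap if one were possible
            active.decrementAndGet();
        });
        try {
            for (int i = 0; i < wakeups; i++) {
                runner.wakeup();
            }
            while (!runner.isIdle()) {
                Thread.sleep(1);
            }
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return max.get();
    }

    public static void main(String[] args) {
        System.out.println("max concurrent executions: " + maxObservedConcurrency(10000));
    }
}
```

Because the window in the original code is small, a probe like this may pass against the racy logic on many runs; it only demonstrates that the combined block never reports more than one concurrent execution.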

      chirino merged a fix yesterday that addresses the symptom of this issue in a different way:
      http://svn.apache.org/viewvc?view=rev&revision=650956

      The added synchronisation means that parallel calls to asyncWrite by the PooledTaskRunner are serialised on method access.
      This fix addresses the root cause and can negate the need for that synchronisation.

      FYI: in the test, the parallel calls can come from flush() and from the asyncWrite task.

      Attachments

        1. AMQ-1689.patch
          4 kB
          Gary Tully

          People

            Assignee: Hiram R. Chirino (chirino)
            Reporter: Gary Tully (gtully)