Qpid
  1. Qpid
  2. QPID-4171

persistent messages from a non-transactional session can be enqueued out of order

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.16
    • Fix Version/s: 0.18
    • Component/s: Java Broker
    • Labels:
      None

      Description

      A defect in AsyncAutoCommitTransaction means that occasionally messages delivered by a non-transactional JMS are seen to be enqueued out-of-order and therefore delivered to a consumer out of the expected order.

      It is this problem that underlies QPID-4014, although as I think this problem is wider than ConflationQueues, I am raising this separately.

        Issue Links

          Activity

          Keith Wall created issue -
          Hide
          Keith Wall added a comment - - edited

          Background:

          AsyncAutoCommitTransaction#addFuture() is used by #enqueue() to arrange for the postTransactionAction to be executed as each store transaction is completed asynchronously by the (BDB store's) commit-thread.

          An optimisation in #addFuture means that an enqueue can 'jump-the-queue', and be enqueued before others that are already waiting in the future recorder list.

          A problematic sequence would be:

          1) Message 1 and Message 2 arrive from the wire (AMQProtocolEngine#received)
          2) Message 1 begins processing. AsyncAutoCommitTransaction.enqueue called, but as commit thread is running behind, its enqueue action joins list of the future recorder (AMQChannel#_unfinishedCommandsQueue).
          3) BDB commit thread catches up the backlog
          4) Message 2 begins processing. AsyncAutoCommitTransaction.enqueue called but as the StoreFuture is already completed, its enqueue action is executed immediately so Message 2 is enqueued.
          5) AMQProtocolEngine#complete calls receiveComplete() completing the actions in _unfinishedCommandsQueue. Message 1 is therefore enqueued.
          6) Message 1 and Message 2 are now out of order. Message 2 will be delivered to a Consumer before Message 1.

              private void addFuture(final StoreFuture future, final Action action)
              {
                  if(action != null)
                  {
                      if(future.isComplete())
                      {
                          action.postCommit();
                      }
                      else
                      {
                          _futureRecorder.recordFuture(future, action);
                      }
                  }
              }
          
          Show
          Keith Wall added a comment - - edited Background: AsyncAutoCommitTransaction#addFuture() is used by #enqueue() to arrange for the postTransactionAction to be executed as each store transaction is completed asynchronously by the (BDB store's) commit-thread. An optimisation in #addFuture means that an enqueue can 'jump-the-queue', and be enqueued before others that are already waiting in the future recorder list. A problematic sequence would be: 1) Message 1 and Message 2 arrive from the wire (AMQProtocolEngine#received) 2) Message 1 begins processing. AsyncAutoCommitTransaction.enqueue called, but as commit thread is running behind, its enqueue action joins list of the future recorder (AMQChannel#_unfinishedCommandsQueue). 3) BDB commit thread catches up the backlog 4) Message 2 begins processing. AsyncAutoCommitTransaction.enqueue called but as the StoreFuture is already completed, its enqueue action is executed immediately so Message 2 is enqueued. 5) AMQProtocolEngine#complete calls receiveComplete() completing the actions in _unfinishedCommandsQueue. Message 1 is therefore enqueued. 6) Message 1 and Message 2 are now out of order. Message 2 will be delivered to a Consumer before Message 1. private void addFuture( final StoreFuture future , final Action action) { if (action != null ) { if ( future .isComplete()) { action.postCommit(); } else { _futureRecorder.recordFuture( future , action); } } }
          Keith Wall made changes -
          Field Original Value New Value
          Description A defect in AsyncAutoCommitTransaction means that occasionally messages delivered by a non-transactional JMS are seen to be enqueued out-of-order and therefore delivered to a consumer out of the expected order.

          It is this problem that underlies QPID-4014, although as I think this problem is wider than ConflationQueues, I am raising this separately.



          A defect in AsyncAutoCommitTransaction means that occasionally messages delivered by a non-transactional JMS are seen to be enqueued out-of-order and therefore delivered to a consumer out of the expected order.

          It is this problem that underlies QPID-4014, although as I think this problem is wider than ConflationQueues, I am raising this separately.




          Component/s Java Broker BDB Store [ 12315809 ]
          Keith Wall made changes -
          Link This issue is related too QPID-4014 [ QPID-4014 ]
          Robbie Gemmell made changes -
          Summary Messages from a non-transactional session can be enqueued out of order persistent messages from a non-transactional session can be enqueued out of order
          Robbie Gemmell made changes -
          Attachment QPID-4171-scribbles.txt [ 12538400 ]
          Hide
          Robbie Gemmell added a comment -

          Renamed JIRA to indicate only persistent messages can be enqueued out of order, transient messages will always have an immediate future assigned and will thus enqueue in-order. It is possible for transient messages to overtake persistent messages, but thats is allowable (though we should have given people the option to prevent it).

          Attached are some helpful scribbles for anyone working on this. It includes a trivial potential fix (&& !persistent), along with some dead code removal, and addition of a noddy system property to escape the previously mentioned 'transient can overtake persistent' behaviour when mixing delivery modes. It semi-duplicates the existing addFuture() method because thats used by more than just the enqueue methods for things that care nothing of whether its persistent message or whether strict ordering is desired, so it seemed like it would be a bit nasty just feeding default parameter values to a shared implementation. This isn't really a patch as such, it hasn't even been compiled properly nevermind tested, but I think it should work

          Show
          Robbie Gemmell added a comment - Renamed JIRA to indicate only persistent messages can be enqueued out of order, transient messages will always have an immediate future assigned and will thus enqueue in-order. It is possible for transient messages to overtake persistent messages, but thats is allowable (though we should have given people the option to prevent it). Attached are some helpful scribbles for anyone working on this. It includes a trivial potential fix (&& !persistent), along with some dead code removal, and addition of a noddy system property to escape the previously mentioned 'transient can overtake persistent' behaviour when mixing delivery modes. It semi-duplicates the existing addFuture() method because thats used by more than just the enqueue methods for things that care nothing of whether its persistent message or whether strict ordering is desired, so it seemed like it would be a bit nasty just feeding default parameter values to a shared implementation. This isn't really a patch as such, it hasn't even been compiled properly nevermind tested, but I think it should work
          Hide
          Robbie Gemmell added a comment -

          (I could probably be persuaded to reverse the system property so that people have the option to turn that behaviour on rather than off, since out blocking IO layer probably means it makes only a small difference to overall performance currently because we have to complete everything synchronously when the read is finished.)

          Show
          Robbie Gemmell added a comment - (I could probably be persuaded to reverse the system property so that people have the option to turn that behaviour on rather than off, since out blocking IO layer probably means it makes only a small difference to overall performance currently because we have to complete everything synchronously when the read is finished.)
          Philip Harvey made changes -
          Assignee Keith Wall [ k-wall ] Philip Harvey [ philharveyonline ]
          Philip Harvey made changes -
          Status Open [ 1 ] In Progress [ 3 ]
          Hide
          Philip Harvey added a comment -

          attached patch

          Show
          Philip Harvey added a comment - attached patch
          Philip Harvey made changes -
          Philip Harvey made changes -
          Status In Progress [ 3 ] Ready To Review [ 10006 ]
          Hide
          Philip Harvey added a comment -

          please commit the patch I attached today

          Show
          Philip Harvey added a comment - please commit the patch I attached today
          Philip Harvey made changes -
          Assignee Philip Harvey [ philharveyonline ] Keith Wall [ k-wall ]
          Hide
          Alex Rudyy added a comment -

          I reviewed the patch and the changes look fine for me.

          Show
          Alex Rudyy added a comment - I reviewed the patch and the changes look fine for me.
          Hide
          Keith Wall added a comment -

          Patch applied.

          Show
          Keith Wall added a comment - Patch applied.
          Keith Wall made changes -
          Status Ready To Review [ 10006 ] Resolved [ 5 ]
          Fix Version/s 0.19 [ 12322452 ]
          Resolution Fixed [ 1 ]
          Hide
          Justin Ross added a comment -

          Reviewed by Keith. Approved for 0.18.

          Show
          Justin Ross added a comment - Reviewed by Keith. Approved for 0.18.
          Hide
          Robbie Gemmell added a comment -

          Change merged to the 0.18 branch.

          Show
          Robbie Gemmell added a comment - Change merged to the 0.18 branch.
          Robbie Gemmell made changes -
          Assignee Keith Wall [ k-wall ] Robbie Gemmell [ gemmellr ]
          Fix Version/s 0.18 [ 12322451 ]
          Fix Version/s 0.19 [ 12322452 ]
          Affects Version/s 0.18 [ 12322451 ]
          Affects Version/s 0.19 [ 12322452 ]
          Rob Godfrey made changes -
          Component/s Java Broker BDB Store [ 12315809 ]
          Gavin made changes -
          Link This issue is related to QPID-4014 [ QPID-4014 ]
          Gavin made changes -
          Link This issue is related to QPID-4014 [ QPID-4014 ]
          Rob Godfrey made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Transition Time In Source Status Execution Times Last Executer Last Execution Date
          Open Open In Progress In Progress
          2d 21h 45m 1 Philip Harvey 01/Aug/12 11:45
          In Progress In Progress Reviewable Reviewable
          3m 2s 1 Philip Harvey 01/Aug/12 11:48
          Reviewable Reviewable Resolved Resolved
          2h 56m 1 Keith Wall 01/Aug/12 14:45
          Resolved Resolved Closed Closed
          924d 6h 21m 1 Rob Godfrey 11/Feb/15 20:07

            People

            • Assignee:
              Robbie Gemmell
              Reporter:
              Keith Wall
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development