QPID-4541

Messages are replayed after XA commit in a failover scenario

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: Future
    • Fix Version/s: 0.21
    • Component/s: Java Client, JCA
    • Labels:
      None
    • Environment:

      All supported OS platforms, all supported JEE platforms

      Description

      In failover scenarios, messages that have been received, and whose XA transaction has completed, are replayed on the nth node of a cluster when failover is invoked. Like QPID-2294, this appears to violate transaction atomicity.

        Activity

        Weston M. Price created issue -
        Rajith Attapattu added a comment -

        The XA session uses AUTO_ACK mode, so by default messages are replayed when the session is recreated.
        The solution is to allow message replay to be turned on or off with a boolean flag when creating a session.

        This could also be useful outside the context of this bug: in some cases the client is not concerned about message loss and therefore does not need to keep messages in its replay buffer, which can affect memory consumption and eventually slow down the client.

        Keith Wall added a comment -

        Hi Rajith, Weston,

        I've spent some time looking at the patch tonight, but I'm lacking some background to review it properly.

        I need some more details about the sequence of events that allows this defect to occur. Perhaps you could summarise what is happening (at protocol level) between the client and the two brokers (the original broker and the failover).

        Also, how does this defect manifest itself to the end-user application? What are they seeing? If there is a stack trace please attach it here.

        Thanks, Keith.

        Rajith Attapattu added a comment -

        Keith, I will let Weston explain the exact use case from an XA point of view, as he has more details and knowledge about it.

        Instead I will try to explain the root cause and the fix I made.

        By default, the JMS client will replay the unacked message transfers in its replay buffer after failover, unless the session is marked transacted.
        (If we allowed replay in a transacted session, messages would be produced or consumed outside of the transaction boundaries, which violates atomicity.)

        The same applies to XA transactions. If we replay messages after failover, messages will be consumed or produced outside of the transaction boundaries.
        However, the XA implementation does not use a transacted JMS session underneath, as the transaction is governed by the XA manager.
        Instead it uses an AUTO_ACK session underneath, which replays message transfers after failover.

        The patch I submitted allows this to be turned off based on a boolean argument.

        Does that make sense? If it's still not clear please feel free to ask further questions.

        Rajith.
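
The flag described above can be illustrated with a minimal, self-contained sketch. This is not the Qpid client code or the actual patch: the class `ReplaySketchSession`, its `replayOnFailover` constructor argument, and the `transfer` and `failover` methods are all hypothetical names invented for illustration. It only shows the idea that a session buffers unacked transfers and re-sends them after failover unless replay has been switched off.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a client session; NOT the Qpid API.
final class ReplaySketchSession {
    // Assumed boolean flag, set once when the session is created.
    private final boolean replayOnFailover;
    // Transfers sent but not yet acknowledged by the broker.
    private final List<String> replayBuffer = new ArrayList<>();

    ReplaySketchSession(boolean replayOnFailover) {
        this.replayOnFailover = replayOnFailover;
    }

    // Send a message; it is only buffered when replay is enabled,
    // so a session with replay off holds no messages in memory.
    void transfer(String message) {
        if (replayOnFailover) {
            replayBuffer.add(message);
        }
    }

    // Simulate failover: return the transfers that would be re-sent
    // to the new broker. With replay off, nothing is re-sent.
    List<String> failover() {
        return new ArrayList<>(replayBuffer);
    }

    public static void main(String[] args) {
        ReplaySketchSession autoAck = new ReplaySketchSession(true);
        autoAck.transfer("m1");
        System.out.println("replay on:  " + autoAck.failover());

        // An XA-backed session would pass false so that nothing is
        // replayed outside the transaction boundaries.
        ReplaySketchSession xaBacked = new ReplaySketchSession(false);
        xaBacked.transfer("m1");
        System.out.println("replay off: " + xaBacked.failover());
    }
}
```

In this sketch the flag is fixed at construction time, mirroring the suggestion that replay be chosen when the session is created rather than toggled afterwards.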

        Rajith Attapattu added a comment -

        The following commit fixes this issue.
        http://svn.apache.org/viewvc?rev=1438725&view=rev

        Rajith Attapattu made changes -
        Field          | Original Value | New Value
        Status         | Open [ 1 ]     | Resolved [ 5 ]
        Fix Version/s  |                | 0.21 [ 12323549 ]
        Resolution     |                | Fixed [ 1 ]

          People

          • Assignee: Rajith Attapattu
          • Reporter: Weston M. Price
          • Votes: 0
          • Watchers: 3
