Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 5.3.1
    • Fix Version/s: 5.4.1, 5.4.2
    • Component/s: Broker
    • Labels:
      None
    • Environment:

      Windows2k3, AMQ 5.3.1,

      Description

      I have two AMQ setup. And they use duplex network connection between. After I restart one ActiveMQ which initiates the connection, half of the message are missing. In order to avoid this problem, I have restart the other ActiveMQ. And this happens when I use "staticallyIncludedDestinations" or "dynamicallyIncludedDestinations" config in broker.

      1 SETUP:
      a) SCA server has a network connector to remote server114. In order to repeat this problem, you have to use "staticallyIncludedDestinations".

      <networkConnector name="SCA" uri="static://(https://192.168.3.114:61617)" duplex="true">
      <staticallyIncludedDestinations>
      <queue physicalName="R"/>
      </staticallyIncludedDestinations>
      </networkConnector>

      b) SCA server has a java code consumer listening on queue R:

      c) Remote server114 is listening on 61617, see config below;
      <transportConnectors>
      <transportConnector name="openwire" uri="tcp://0.0.0.0:61616"/>
      <transportConnector name="https" uri="https://0.0.0.0:61617?needClientAuth=true"/>
      </transportConnectors>
      (See attached picture "Remote-Console1.jpg".)

      2. Restart activemq on SCA server and restart consumer application listening on queue R on SCA too. Remote server114 activemq admin console shows there are two consumers on R.
      (See attached picture "Remote-Console2.jpg")

      3 Start a producer on remote server114 to send 10 messages to R. On SCA server, consumer on R only receives 5 messages.
      (See attached picture "SCA-consumer1.JPG")

      4. On remote server114 activemq admin console, these 10 messages are divided by these two consumers.
      (See attached picture "Remote-Console3.jpg")

      1. test-case.txt
        15 kB
        Stan Lewis
      2. test-case.txt
        9 kB
        Stan Lewis
      3. SCA-consumer1.JPG
        42 kB
        Qingyi Gu
      4. Remote-Console3.JPG
        57 kB
        Qingyi Gu
      5. Remote-Console2.JPG
        57 kB
        Qingyi Gu
      6. Remote-Console1.JPG
        56 kB
        Qingyi Gu

        Issue Links

          Activity

          Hide
          Lowry Curry added a comment -

          Steps to reproduce the issue
          ----------------------------------
          To reproduce this issue I installed apache-activemq-5.3.2 in two separate locations. The scenario simulates a hub and spoke architecture in which the hub (Broker1) will be connected (via a networkConnector) to Broker2. The network connector uses "http" rather than "tcp" for connection between brokers. This is a requirement for customer because customer broker connection must traverse distinct networks. Note the problem does not seem to occur when "tcp" is used. The networkConnector also defines a "staticallyIncludedDestination".

          The problem happens when Broker1 is shutdown and restarted. After that, the networkConnector (store and forward)
          mechanism seems to break. Messages that should be available to a consumer in queue on Broker1 are not sent to consumer
          and appear to be lost.

          The actual config details are given below. Also attached are the activemq.xml files for Broker1 and Broker2.

          1) Update activemq.xml for Broker1.
          The changes to the default configuration for Broker1 include the following:
          • Change the broker name from localhost to Broker1
          • Add networkConnector to Broker2.

          <broker xmlns="http://activemq.apache.org/schema/core" useJmx="true" brokerName="BROKER1" dataDirectory="$

          {activemq.base}
          /data" destroyApplicationContextOnStop="true" >

          ...
          ...
          <networkConnectors>
          <networkConnector name="SCA" uri="static://(http://localhost:61617)" duplex="true">
          <staticallyIncludedDestinations>
          <queue physicalName="R"/>
          </staticallyIncludedDestinations>
          </networkConnector>
          </networkConnectors>


          2) Update the activemq.xml for Broker2
          The changes to the default configuration for Broker2 include the following:
          • Change the broker name from localhost to Broker2
          • Change the default transport connector (tcp) to listen on 61618 (to avoid conflict with Broker1's tcp port)
          • Add an transport connector (http) for port 61617. This is the port that will be connected to by Broker1.

          <broker xmlns="http://activemq.apache.org/schema/core" useJmx="true" brokerName="BROKER2" dataDirectory="${activemq.base}

          /data" destroyApplicationContextOnStop="true">
          ...
          ...
          <transportConnectors>
          <transportConnector name="openwire" uri="tcp://0.0.0.0:61618"/>
          <transportConnector name="http" uri="http://0.0.0.0:61617"/>
          </transportConnectors>

          3) Start Broker2 using the activemq.bat script in Broker2's installation /bin directory.
          4) Start Broker1 using the activemq.bat script in Broker1's installation /bin directory.

          5) Run simple producer/consumer scenarios.
          a) Run consumer against 61616 (Broker1).
          Go to Broker1's example directory:
          E.g.
          C:\Progress\apache-activemq-5.3.2\example
          Run the following:

           ant consumer -Durl=tcp://localhost:61616 -Dmax=100 -Dsubject=R
          The above command will connect to Broker1 and listen for 100 messages on queue R on Broker1

          b) Next in another window - same directory - run the following:
           ant producer -Durl=tcp://localhost:61616 -Dmax=10 -Dsubject=R
          The above command will connect to Broker1 and send 10 messages to queue R on Broker1

          You should see the consumer will receive the 10 messages.
          c) Next, Run producer to send messages to Broker2:
           ant producer -Durl=tcp://localhost:61618 -Dmax=10 -Dsubject=R
          Because of the networkConnector set up for queue R between Broker1->Broker2, the consumer (that was already running and connected to Broker1) should also receive these 10 messages that were sent to queue R on Broker2.

          6) Shutdown Broker1 and restart.

          7) Run step 5 again. You will see when you run the producer the second time (step 5c) that not all the messages get to the consumer connected to queue R on Broker1. Only every other message arrive at consumer.

          You can shutdown Broker1 and repeat step 5 over and over and each time you will see fewer and fewer messages arrive at consumer connected to queue R on Broker1. Where do these messages go?

          Information from jconsole
          -------------------------------
          While running this test case, start jconsole and attach to each of the brokers.
          On Broker2 open the MBeans tab and expand activemq->BROKER2->Subscription->Non-Durable->Queue->R->NC_BROKER1_inbound_BROKER2
          This will show the subscription created on Broker2 by the networkConnector.
          Each time you shutdown Broker1 and restart, jconsole will show a new Subscription created in Broker2.

          Show
          Lowry Curry added a comment - Steps to reproduce the issue ---------------------------------- To reproduce this issue I installed apache-activemq-5.3.2 in two separate locations. The scenario simulates a hub and spoke architecture in which the hub (Broker1) will be connected (via a networkConnector) to Broker2. The network connector uses "http" rather than "tcp" for connection between brokers. This is a requirement for customer because customer broker connection must traverse distinct networks. Note the problem does not seem to occur when "tcp" is used. The networkConnector also defines a "staticallyIncludedDestination". The problem happens when Broker1 is shutdown and restarted. After that, the networkConnector (store and forward) mechanism seems to break. Messages that should be available to a consumer in queue on Broker1 are not sent to consumer and appear to be lost. The actual config details are given below. Also attached are the activemq.xml files for Broker1 and Broker2. 1) Update activemq.xml for Broker1. The changes to the default configuration for Broker1 include the following: • Change the broker name from localhost to Broker1 • Add networkConnector to Broker2. <broker xmlns="http://activemq.apache.org/schema/core" useJmx="true" brokerName="BROKER1" dataDirectory="$ {activemq.base} /data" destroyApplicationContextOnStop="true" > ... ... <networkConnectors> <networkConnector name="SCA" uri="static://( http://localhost:61617 )" duplex="true"> <staticallyIncludedDestinations> <queue physicalName="R"/> </staticallyIncludedDestinations> </networkConnector> </networkConnectors> 2) Update the activemq.xml for Broker2 The changes to the default configuration for Broker2 include the following: • Change the broker name from localhost to Broker2 • Change the default transport connector (tcp) to listen on 61618 (to avoid conflict with Broker1's tcp port) • Add an transport connector (http) for port 61617. This is the port that will be connected to by Broker1. <broker xmlns="http://activemq.apache.org/schema/core" useJmx="true" brokerName="BROKER2" dataDirectory="${activemq.base} /data" destroyApplicationContextOnStop="true"> ... ... <transportConnectors> <transportConnector name="openwire" uri="tcp://0.0.0.0:61618"/> <transportConnector name="http" uri="http://0.0.0.0:61617"/> </transportConnectors> 3) Start Broker2 using the activemq.bat script in Broker2's installation /bin directory. 4) Start Broker1 using the activemq.bat script in Broker1's installation /bin directory. 5) Run simple producer/consumer scenarios. a) Run consumer against 61616 (Broker1). Go to Broker1's example directory: E.g. C:\Progress\apache-activemq-5.3.2\example Run the following:  ant consumer -Durl=tcp://localhost:61616 -Dmax=100 -Dsubject=R The above command will connect to Broker1 and listen for 100 messages on queue R on Broker1 b) Next in another window - same directory - run the following:  ant producer -Durl=tcp://localhost:61616 -Dmax=10 -Dsubject=R The above command will connect to Broker1 and send 10 messages to queue R on Broker1 You should see the consumer will receive the 10 messages. c) Next, Run producer to send messages to Broker2:  ant producer -Durl=tcp://localhost:61618 -Dmax=10 -Dsubject=R Because of the networkConnector set up for queue R between Broker1->Broker2, the consumer (that was already running and connected to Broker1) should also receive these 10 messages that were sent to queue R on Broker2. 6) Shutdown Broker1 and restart. 7) Run step 5 again. You will see when you run the producer the second time (step 5c) that not all the messages get to the consumer connected to queue R on Broker1. Only every other message arrive at consumer. You can shutdown Broker1 and repeat step 5 over and over and each time you will see fewer and fewer messages arrive at consumer connected to queue R on Broker1. Where do these messages go? Information from jconsole ------------------------------- While running this test case, start jconsole and attach to each of the brokers. On Broker2 open the MBeans tab and expand activemq->BROKER2->Subscription->Non-Durable->Queue->R->NC_BROKER1_inbound_BROKER2 This will show the subscription created on Broker2 by the networkConnector. Each time you shutdown Broker1 and restart, jconsole will show a new Subscription created in Broker2.
          Hide
          Stan Lewis added a comment - - edited

          I've put together a test case (attached test-case.txt) similar to the NetworkReconnectTest that specifically goes through the steps in Lowry's description but I can't seem to reproduce this on trunk. Have you tried this on the 5.4 release or on the latest snapshot?

          Show
          Stan Lewis added a comment - - edited I've put together a test case (attached test-case.txt) similar to the NetworkReconnectTest that specifically goes through the steps in Lowry's description but I can't seem to reproduce this on trunk. Have you tried this on the 5.4 release or on the latest snapshot?
          Hide
          Qingyi Gu added a comment -

          I just tested with AMQ5.4.0 release, the problem still exists. I looked at your activemq config in your test case. I think you have broker1 and broker2 reversed based on Lowry's description. There is one little typo in Lowry's comment. In his first step "1) Update activemq.xml for Broker1.", it should be "Add networkConnector to Broker1", not "Add networkConnector to Broker2.".

          Show
          Qingyi Gu added a comment - I just tested with AMQ5.4.0 release, the problem still exists. I looked at your activemq config in your test case. I think you have broker1 and broker2 reversed based on Lowry's description. There is one little typo in Lowry's comment. In his first step "1) Update activemq.xml for Broker1.", it should be "Add networkConnector to Broker1", not "Add networkConnector to Broker2.".
          Hide
          Stan Lewis added a comment -

          Thanks for that, have re-worked the test case and it does reproduce the issue now, have attached an update.

          Show
          Stan Lewis added a comment - Thanks for that, have re-worked the test case and it does reproduce the issue now, have attached an update.
          Hide
          Gary Tully added a comment -

          it looks like due to http being async the bridge does not get torn down when the remote broker shutsdown as is the case with a tcp connection. With tcp there is always an outstanding read that will fail. It looks like the failed read is is being retried in the http case.
          It may be a case of configuring retry semantics for http client but I think it may need some duplicate detection of network connectors based on name or something. need to research it a bit.

          Show
          Gary Tully added a comment - it looks like due to http being async the bridge does not get torn down when the remote broker shutsdown as is the case with a tcp connection. With tcp there is always an outstanding read that will fail. It looks like the failed read is is being retried in the http case. It may be a case of configuring retry semantics for http client but I think it may need some duplicate detection of network connectors based on name or something. need to research it a bit.
          Hide
          Gary Tully added a comment -

          fix in r990107

          InactivityMonitor added to http transport by default. May be disabled via query parameters useInactivityMonitor=false for clients and transport.useInactivityMonitor=false on the broker side.
          The default 30second timeout can be configured via url query params:

          "http://localhost:61617?transport.readCheckTime=4000&amp;transport.initialDelayTime=4000"

          For a duplex network connector, these are applied to the target broker http listener so that they apply to the return network connector transport connection.
          test case included, was a great help, thanks.

          Show
          Gary Tully added a comment - fix in r990107 InactivityMonitor added to http transport by default. May be disabled via query parameters useInactivityMonitor=false for clients and transport.useInactivityMonitor=false on the broker side. The default 30second timeout can be configured via url query params: "http: //localhost:61617?transport.readCheckTime=4000&amp;transport.initialDelayTime=4000" For a duplex network connector, these are applied to the target broker http listener so that they apply to the return network connector transport connection. test case included, was a great help, thanks.
          Hide
          Dejan Bosanac added a comment -

          An additional fix that enables transport options to persist connector restart has been committed with svn revision 1005033 and will be available in 5.5.0

          Show
          Dejan Bosanac added a comment - An additional fix that enables transport options to persist connector restart has been committed with svn revision 1005033 and will be available in 5.5.0

            People

            • Assignee:
              Gary Tully
              Reporter:
              Qingyi Gu
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development