ActiveMQ / AMQ-2114

Failover transport should not hang on startup if it cannot connect

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Won't Fix
    • Affects Version/s: 5.2.0
    • Fix Version/s: 5.3.2
    • Component/s: Transport
    • Labels:
      None
    • Environment:

      Sun Java 1.6.0_12
      Fedora Linux 10
      ActiveMQ 5.2.0

      Description

      When connecting with a failover transport, like the DEFAULT_BROKER_URL, the transport hangs on connection.start() if it cannot connect to the remote broker. It should return normally.

      This only happens on startup. Later disconnects behave nicely.

        Activity

        Uwe Kubosch created issue -
        Dejan Bosanac added a comment -

        Can you post the URL you are using and turn on debug logging for FailoverTransport to capture the exception thrown by the transport?

        Uwe Kubosch added a comment -

        Will do.

        Uwe Kubosch added a comment -

        I am using ActiveMQConnection.DEFAULT_BROKER_URL

        Here is the debug output:

        0 [main] DEBUG org.apache.activemq.transport.failover.FailoverTransport - Reconnect was triggered but transport is not started yet. Wait for start to connect the transport.
        12 [main] DEBUG org.apache.activemq.transport.failover.FailoverTransport - Started.
        13 [main] DEBUG org.apache.activemq.transport.failover.FailoverTransport - Waking up reconnect task
        18 [ActiveMQ Task] DEBUG org.apache.activemq.transport.failover.FailoverTransport - Attempting connect to: tcp://localhost:61616
        DEBUG ConnectionProcessor: connection OK
        149 [ActiveMQ Task] DEBUG org.apache.activemq.transport.failover.FailoverTransport - Connect fail to: tcp://localhost:61616, reason: java.net.ConnectException: Connection refused
        149 [ActiveMQ Task] DEBUG org.apache.activemq.transport.failover.FailoverTransport - Waiting 10 ms before attempting connection.
        159 [ActiveMQ Task] DEBUG org.apache.activemq.transport.failover.FailoverTransport - Attempting connect to: tcp://localhost:61616
        160 [ActiveMQ Task] DEBUG org.apache.activemq.transport.failover.FailoverTransport - Connect fail to: tcp://localhost:61616, reason: java.net.ConnectException: Connection refused
        160 [ActiveMQ Task] DEBUG org.apache.activemq.transport.failover.FailoverTransport - Waiting 20 ms before attempting connection.
        183 [ActiveMQ Task] DEBUG org.apache.activemq.transport.failover.FailoverTransport - Attempting connect to: tcp://localhost:61616
        184 [ActiveMQ Task] DEBUG org.apache.activemq.transport.failover.FailoverTransport - Connect fail to: tcp://localhost:61616, reason: java.net.ConnectException: Connection refused
        184 [ActiveMQ Task] DEBUG org.apache.activemq.transport.failover.FailoverTransport - Waiting 40 ms before attempting connection.
        225 [ActiveMQ Task] DEBUG org.apache.activemq.transport.failover.FailoverTransport - Attempting connect to: tcp://localhost:61616
        241 [ActiveMQ Task] DEBUG org.apache.activemq.transport.failover.FailoverTransport - Connect fail to: tcp://localhost:61616, reason: java.net.ConnectException: Connection refused
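
The waits in the log above grow from 10 ms to 20 ms to 40 ms, which is consistent with an exponential backoff that doubles after each failed attempt. A minimal Java sketch of that schedule follows. The class and method names are hypothetical, the initial delay and multiplier are read off the log, and the 30-second cap is an assumption for illustration rather than a value taken from this issue.

```java
import java.util.ArrayList;
import java.util.List;

final class BackoffSchedule {
    /**
     * Returns the first `attempts` wait times in milliseconds.
     * Each wait is the previous one times `multiplier`, capped at `maxDelayMs`.
     */
    static List<Long> delays(long initialDelayMs, double multiplier, long maxDelayMs, int attempts) {
        List<Long> out = new ArrayList<>();
        long delay = initialDelayMs;
        for (int i = 0; i < attempts; i++) {
            out.add(delay);
            // Grow the delay for the next attempt, never exceeding the cap.
            delay = Math.min(maxDelayMs, (long) (delay * multiplier));
        }
        return out;
    }

    public static void main(String[] args) {
        // 10 ms initial delay and a multiplier of 2 match the log; the cap is assumed.
        System.out.println(delays(10, 2.0, 30_000, 5));
    }
}
```

With no limit on attempts, a schedule like this keeps retrying indefinitely, which is why a caller waiting on the first successful connect never returns while the broker is down.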

        Dejan Bosanac added a comment -

        Hi Uwe,

        This is exactly what the failover transport should do: keep trying to reconnect until the broker becomes available. If you want to limit the number of retries, try the maxReconnectAttempts option, for example:

        failover:(tcp://localhost:61616)?maxReconnectAttempts=3
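
For anyone assembling such a URL in code rather than typing it by hand, a small helper along these lines might do. This is only a sketch: the class and method names are hypothetical, it performs plain string assembly with no validation, and the only option name taken from this thread is maxReconnectAttempts.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class FailoverUri {
    /** Builds "failover:(broker1,broker2)?key=value&..." from its parts. */
    static String build(List<String> brokers, Map<String, String> options) {
        String base = "failover:(" + String.join(",", brokers) + ")";
        if (options.isEmpty()) {
            return base;
        }
        // Options are appended as a query string in the map's iteration order.
        String query = options.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining("&"));
        return base + "?" + query;
    }

    public static void main(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("maxReconnectAttempts", "3");
        // The resulting string is what would be handed to the client as the broker URL.
        System.out.println(build(List.of("tcp://localhost:61616"), options));
    }
}
```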

        Dejan Bosanac made changes -
        Field Original Value New Value
        Status Open [ 1 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Uwe Kubosch added a comment - edited

        Yes, I would like failover transport to try reconnecting until the broker becomes available, BUT I would like it not to block! The failover transport does NOT block on later calls even if there is no connection to the broker, so why should it block on startup if the broker is down? I do not want to configure maxReconnectAttempts since I want the transport to keep trying, forever.

        An example is when you configure jmsBridgeConnectors with a jmsQueueConnector using a connection factory with a failover transport. If the remote broker is not available, the local ActiveMQ instance will never finish starting up! This cannot be what you intend?!

        I would expect the failover transport to start, and keep trying to connect to the remote broker.

        Dejan Bosanac added a comment -

        Hi,

        I just investigated this a bit further. The problem is that connection.start() tries to send a ConnectionInfo packet to the broker to establish the connection. Once the connection has been established successfully, subsequent start() calls do not do this, so it may appear to behave differently in that case.

        Anyhow, this is a crucial step in establishing a connection, so it has to be done synchronously, because all later operations (creating sessions, consumers, etc.) assume a valid connection. Perhaps the issue you described with jmsQueueConnector and the failover transport should be tackled somewhere else (not in the connection.start() procedure).
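
Because start() blocks until the broker has accepted the connection, a caller that must not block has to move the call off its own thread. The Java sketch below shows one shape such a manual workaround could take. It is not something proposed by the ActiveMQ developers in this thread: BlockingStart is a hypothetical stand-in for the blocking start() call, and the CountDownLatch in main only simulates the broker becoming available. No sessions or consumers should be created until the returned future completes.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class BackgroundStart {
    /** Hypothetical stand-in for a blocking call such as connection.start(). */
    interface BlockingStart {
        void start() throws Exception;
    }

    /** Runs the blocking start on the executor and returns to the caller immediately. */
    static CompletableFuture<Void> startAsync(BlockingStart connection, ExecutorService executor) {
        CompletableFuture<Void> started = new CompletableFuture<>();
        executor.execute(() -> {
            try {
                connection.start();       // may block until the broker is reachable
                started.complete(null);
            } catch (Exception e) {
                started.completeExceptionally(e);
            }
        });
        return started;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch brokerUp = new CountDownLatch(1);   // simulated broker availability
        CompletableFuture<Void> started = startAsync(brokerUp::await, executor);
        System.out.println("caller continues, started=" + started.isDone());
        brokerUp.countDown();                              // the "broker" comes up
        started.get(5, TimeUnit.SECONDS);
        System.out.println("started=" + started.isDone());
        executor.shutdown();
    }
}
```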

        Peter Voss added a comment -

        The resolution of this issue has been set to fixed. If that's really true, what's the fix version then?

        Rafal N added a comment -

        Same as above. Is it really fixed? In which revision/version?

        Julio Faerman added a comment -

        I am also having this problem on 5.3.0. I think this should be reopened.

        Julio Faerman made changes -
        Status Resolved [ 5 ] Reopened [ 4 ]
        Resolution Fixed [ 1 ]
        Roman Schmidmeir added a comment -

        Same problem here on 5.3.0.
        A manual workaround is pretty ugly.

        Rob Davies made changes -
        Assignee Rob Davies [ rajdavies ]
        Rob Davies made changes -
        Fix Version/s 5.4.1 [ 12332 ]
        Rob Davies added a comment -

        This is working as designed, i.e. it has to block on starting a connection. If you are publishing a message and a failure occurs, the connection would block too. To get behaviour where the connection doesn't block, you need to use an embedded broker with a network connection to do store and forward. So instead of creating a connection factory with a URL of:

        failover://(tcp://remotebroker:61616)

        you could use a broker URI, e.g.

        vm:(broker:(tcp://localhost:61616,network:static:tcp://remotehost:61616))

        See http://activemq.apache.org/broker-uri.html for more details.

        Rob Davies made changes -
        Fix Version/s 5.4.1 [ 12332 ]
        Fix Version/s 5.3.2 [ 12310 ]
        Resolution Won't Fix [ 2 ]
        Status Reopened [ 4 ] Resolved [ 5 ]
        Alec Bickerton added a comment -

        As I see it, if you use vm:(broker:(tcp://localhost:61616,network:static:tcp://remotehost:61616)) then you are explicitly not using the failover protocol. IMHO, this should be reopened and properly fixed.

        Jeff Turner made changes -
        Project Import Fri Nov 26 22:32:02 EST 2010 [ 1290828722158 ]

          People

          • Assignee:
            Rob Davies
            Reporter:
            Uwe Kubosch
          • Votes:
            4
            Watchers:
            9
