UIMA-2251

UIMA AS aggregate disables broker connection

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.4.0AS
    • Component/s: Async Scaleout
    • Labels:
      None

      Description

      A user reported a problem that appears to be a bug in the UIMA AS code that detects the status of a broker connection. The user's description follows:

      I have an aggregate AE with a remote primitive (OntoAnnotator). Both have
      their queues at the same broker. Clients send requests to the aggregate
      using the sendCAS() method.

      This was running fine for about 5-6 hours, but then the aggregate logged an
      error:
      11/10/04 02:00:12 INFO cpe.DynamicFlowController$DynamicFlow: Next Executing Annotator :: OntoAnnotator
      11/10/04 02:00:12 INFO activemq.JmsOutputChannel: Controller AnalysisAggregator Invalidating JMS Connection To Broker tcp://broker_ip:61616 and Closing Sessions To Delegates

      It had received 4-5 timeouts from the remote delegate over time, but at
      least a couple of hours before the above log. Both the broker and the remote
      delegate were still running and had not crashed.

      The aggregate continued processing requests after that – the CASes are
      processed by all collocated primitives but not the remote one. Each CAS
      process request gets an exception:
      11/10/04 02:00:12 WARN activemq.JmsEndpointConnection_impl:
      org.apache.uima.aae.error.DelegateConnectionLostException: Controller:AnalysisAggregator Lost Connection to Delegate:OntoAnnotator
      at org.apache.uima.adapter.jms.activemq.JmsEndpointConnection_impl.send(JmsEndpointConnection_impl.java:547)
      at org.apache.uima.adapter.jms.activemq.JmsEndpointConnection_impl.send(JmsEndpointConnection_impl.java:509)
      at org.apache.uima.adapter.jms.activemq.JmsOutputChannel.dispatch(JmsOutputChannel.java:1366)
      at org.apache.uima.adapter.jms.activemq.JmsOutputChannel.sendCasToRemoteEndpoint(JmsOutputChannel.java:1527)
      at org.apache.uima.adapter.jms.activemq.JmsOutputChannel.serializeCasAndSend(JmsOutputChannel.java:658)
      at org.apache.uima.adapter.jms.activemq.JmsOutputChannel.sendRequest(JmsOutputChannel.java:610)
      at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.dispatch(AggregateAnalysisEngineController_impl.java:2395)
      at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.dispatchProcessRequest(AggregateAnalysisEngineController_impl.java:2435)
      at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.simpleStep(AggregateAnalysisEngineController_impl.java:1295)
      at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.executeFlowStep(AggregateAnalysisEngineController_impl.java:2316)
      at org.apache.uima.aae.controller.AggregateAnalysisEngineController_impl.process(AggregateAnalysisEngineController_impl.java:1230)
      at org.apache.uima.aae.handler.HandlerBase.invokeProcess(HandlerBase.java:118)
      at org.apache.uima.aae.handler.input.ProcessResponseHandler.cancelTimerAndProcess(ProcessResponseHandler.java:108)

      When I tried stopping the aggregate, the logs said the following though
      there was no CAS request in process:
      11/10/04 10:18:18 WARN service.UIMA_Service: Uima AS Service AnalysisAggregator Caught Kill Signal - Initiating Quiesce and Stop
      11/10/04 10:18:18 INFO controller.BaseAnalysisEngineController: Stopping Controller: AnalysisAggregator
      11/10/04 10:18:18 INFO activemq.JmsInputChannel: Stopping Service JMS Transport. Service: q_async_ae
      11/10/04 10:18:18 INFO activemq.JmsInputChannel: Controller: AnalysisAggregator Stopped Listener on Endpoint: queue://q_async_ae Selector: Selector:Command=2000 OR Command=2002.
      11/10/04 10:18:18 INFO activemq.JmsInputChannel: Stopping Service JMS Transport. Service: q_async_ae
      11/10/04 10:18:18 INFO activemq.JmsInputChannel: Controller: AnalysisAggregator Stopped Listener on Endpoint: queue://q_async_ae Selector: Selector:Command=2001.
      11/10/04 10:18:18 INFO controller.BaseAnalysisEngineController: Controller: AnalysisAggregator Registering onEmpty Callback With InProcessCache.
      11/10/04 10:18:18 INFO controller.BaseAnalysisEngineController: Controller: AnalysisAggregator Awaiting onEmpty Callback From InProcessCache

      After restarting just the aggregate, it connected to the remote AE just
      fine. So I'm wondering why the aggregate decided to stop communicating with
      it earlier?

      I've seen a previous thread with a similar error (
      http://thread.gmane.org/gmane.comp.apache.uima.general/3351/focus=3388), but
      there the broker was wilfully taken down, whereas I did no such thing.

      Thanks, and sorry for the barrage of info.

      Meghana

        Activity

        Jerry Cwiklik added a comment -

        Meghana, check whether your aggregate is configured to disable the remote delegate after N errors/timeouts. I'm not sure why the connection is lost; perhaps a network glitch. The code checks the state of the connection, and if connection.isOpen() returns false, the connection is assumed to be bad. That is a recoverable condition: on a subsequent attempt to dispatch, the aggregate tries to reopen the connection. In your case, the subsequent CAS fails to dispatch because the endpoint seems to be disabled. I need to know whether the disable happens because of the aggregate configuration.
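
        The check-and-reopen behaviour described in this comment can be sketched roughly as follows. This is only an illustration under stated assumptions: SketchConnection and SketchDispatcher are hypothetical stand-ins, not UIMA AS or JMS classes, and the real code path in JmsOutputChannel is more involved.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a broker connection; not the UIMA AS or JMS API.
class SketchConnection {
    private boolean open = true;

    boolean isOpen() { return open; }

    // Simulates the connection being lost, e.g. after a network glitch.
    void drop() { open = false; }

    void reopen() { open = true; }
}

// Hypothetical dispatcher: a closed connection is treated as recoverable
// and is reopened on the next dispatch attempt.
class SketchDispatcher {
    private final SketchConnection connection;
    private final List<String> sent = new ArrayList<>();
    private int reopenCount = 0;

    SketchDispatcher(SketchConnection connection) {
        this.connection = connection;
    }

    void dispatch(String casId) {
        if (!connection.isOpen()) {
            // Connection is assumed bad; try to recover before sending.
            connection.reopen();
            reopenCount++;
        }
        sent.add(casId);
    }

    int reopenCount() { return reopenCount; }

    List<String> sent() { return sent; }
}

class ReopenOnDispatchDemo {
    public static void main(String[] args) {
        SketchConnection connection = new SketchConnection();
        SketchDispatcher dispatcher = new SketchDispatcher(connection);

        dispatcher.dispatch("cas-1");
        connection.drop();            // connection lost between requests
        dispatcher.dispatch("cas-2"); // reopened transparently on dispatch

        System.out.println("sent=" + dispatcher.sent()
                + " reopened=" + dispatcher.reopenCount());
    }
}
```

        In this model the second dispatch succeeds because the reopen happens before the send. The behaviour reported in this issue differs: the later dispatch fails, which is why the comment asks whether the delegate was disabled by configuration.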

        Jerry Cwiklik added a comment -

        One more question regarding this problem. It's a long shot, but do you still have the log from the failed run? It would be helpful to see whether other exceptions preceded the problem. Specifically, did you get an exception in the aggregate code when trying to send a request to the remote delegate?

        Meghana Marathe added a comment -

        Hi Jerry,

        I don't think I've configured disabling after N errors/timeouts. This is what my aggregate deployment descriptor looks like:

        <?xml version="1.0" encoding="UTF-8"?>
        <analysisEngineDeploymentDescription
            xmlns="http://uima.apache.org/resourceSpecifier">
          <name>AsyncAggregate</name>

          <deployment protocol="jms" provider="activemq">
            <casPool numberOfCASes="2" />
            <service>
              <inputQueue endpoint="q_async_ae" brokerURL="tcp://{MASTER}:61616" />
              <topDescriptor>
                <import name="analysis_engine.aggregate.AnalysisAggregator" />
              </topDescriptor>
              <analysisEngine key="AnalysisAggregator" async="true">
                <delegates>
                  <remoteAnalysisEngine key="OntoAnnotator">
                    <inputQueue endpoint="OntoAnnotatorQueue" brokerURL="tcp://{MASTER}:61616"/>
                    <serializer method="xmi"/>
                    <asyncAggregateErrorConfiguration>
                      <processCasErrors timeout="500000" continueOnRetryFailure="true" />
                    </asyncAggregateErrorConfiguration>
                  </remoteAnalysisEngine>
                </delegates>
              </analysisEngine>
            </service>
          </deployment>
        </analysisEngineDeploymentDescription>

        Please let me know whether there's any other default/etc that enables disabling on N errors/timeouts.

        Meghana Marathe added a comment -

        I'd saved the logs, so here are the errors I saw:

        11/10/04 00:54:41 WARN activemq.UimaDefaultMessageListenerContainer: Jms Listener Failed. Endpoint: q_async_ae Managed By: tcp://broker_ip:61616 Reason: javax.jms.JMSException: java.io.EOFException
        javax.jms.JMSException: Channel was inactive for too long: broker_ip/192.168.0.92:61616
        at org.apache.activemq.util.JMSExceptionSupport.create(JMSExceptionSupport.java:62)
        at org.apache.activemq.ActiveMQConnection.doAsyncSendPacket(ActiveMQConnection.java:1259)
        11/10/04 00:54:41 WARN activemq.UimaDefaultMessageListenerContainer: Uima AS Service:AnalysisAggregator Listener Unable To Connect To Broker: tcp://broker_ip:61616 Retrying ...
        11/10/04 00:54:42 WARN activemq.UimaDefaultMessageListenerContainer: Uima AS Service:AnalysisAggregator Listener Established Connection to Broker: tcp://broker_ip:61616
        ...
        ...
        11/10/04 02:00:06 WARN activemq.UimaDefaultMessageListenerContainer: Jms Listener Failed. Endpoint: q_async_ae Managed By: tcp://broker_ip:61616 Reason: javax.jms.JMSException: Channel was inactive for too long: broker_ip/192.168.0.92:61616
        javax.jms.JMSException: Channel was inactive for too long: broker_ip/192.168.0.92:61616
        at org.apache.activemq.util.JMSExceptionSupport.create(JMSExceptionSupport.java:62)
        at org.apache.activemq.ActiveMQConnection.doAsyncSendPacket(ActiveMQConnection.java:1259)
        11/10/04 02:00:06 INFO activemq.JmsInputChannel: Stopping Listener On Endpoint: temp-queue://ID:client-ip-47414-1317650735327-0:0:1
        11/10/04 02:00:06 WARN activemq.UimaDefaultMessageListenerContainer: Uima AS Service:AnalysisAggregator Listener Unable To Connect To Broker: tcp://broker_ip:61616 Retrying ...
        11/10/04 02:00:06 WARN activemq.UimaDefaultMessageListenerContainer: Uima AS Service:AnalysisAggregator Listener Established Connection to Broker: tcp://germinait22:61616
        ...
        ...
        11/10/04 02:00:12 INFO activemq.JmsOutputChannel: Controller AnalysisAggregator Invalidating JMS Connection To Broker tcp://broker_ip:61616 and Closing Sessions To Delegates
        11/10/04 02:00:12 WARN activemq.JmsEndpointConnection_impl: Service: AnalysisAggregator Runtime Exception
        11/10/04 02:00:12 WARN activemq.JmsEndpointConnection_impl:
        org.apache.uima.aae.error.DelegateConnectionLostException: Controller:AnalysisAggregator Lost Connection to Delegate:OntoAnnotator
        at org.apache.uima.adapter.jms.activemq.JmsEndpointConnection_impl.send(JmsEndpointConnection_impl.java:547)
        at org.apache.uima.adapter.jms.activemq.JmsEndpointConnection_impl.send(JmsEndpointConnection_impl.java:509)

        I didn't forcibly stop the broker. I tried the same deployment again the next day (to account for network glitches) and it too failed with similar errors. I'd really like to be able to make the delegate remote, if possible.

        Thanks.

        Jerry Cwiklik added a comment -

        When a JMS send() fails, the delegate is marked as FAILED and its listener is stopped. A subsequent attempt to send a CAS to the delegate is supposed to initiate full recovery of the listener on a new temp queue. The code was performing only partial recovery: it failed to restart the listener for the delegate and never changed the delegate's state back to OK, which resulted in the DelegateConnectionLostException.
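
        The difference between partial and full recovery described above can be modelled with a small sketch. Everything here is hypothetical (SketchDelegate, its fields, and its method names are illustrative only and are not the UIMA AS classes or the actual fix); it assumes only what the comment states: a failed send marks the delegate FAILED and stops its listener, and recovery must restart the listener and reset the state to OK.

```java
// Hypothetical model of the delegate bookkeeping; not actual UIMA AS code.
class SketchDelegate {
    enum State { OK, FAILED }

    private State state = State.OK;
    private boolean listenerRunning = true;
    private int tempQueueId = 1;

    // A failed send marks the delegate FAILED and stops its listener.
    void onSendFailure() {
        state = State.FAILED;
        listenerRunning = false;
    }

    // Models the buggy path: a new temp queue is set up, but the listener
    // is not restarted and the state is never reset to OK.
    void recoverPartially() {
        tempQueueId++;
    }

    // Models the intended path: new temp queue, listener restarted, state OK.
    void recoverFully() {
        tempQueueId++;
        listenerRunning = true;
        state = State.OK;
    }

    void send(String casId) {
        if (state != State.OK || !listenerRunning) {
            throw new IllegalStateException(
                    "Lost connection to delegate while sending " + casId);
        }
    }

    State state() { return state; }

    boolean listenerRunning() { return listenerRunning; }

    int tempQueueId() { return tempQueueId; }
}

class DelegateRecoveryDemo {
    public static void main(String[] args) {
        SketchDelegate buggy = new SketchDelegate();
        buggy.onSendFailure();
        buggy.recoverPartially();
        try {
            buggy.send("cas-1");
        } catch (IllegalStateException e) {
            System.out.println("partial recovery: " + e.getMessage());
        }

        SketchDelegate fixed = new SketchDelegate();
        fixed.onSendFailure();
        fixed.recoverFully();
        fixed.send("cas-1"); // succeeds: listener restarted, state is OK
        System.out.println("full recovery: state=" + fixed.state());
    }
}
```

        In this model, every send after a partial recovery keeps failing until the process is restarted, which matches the symptom in the report: the aggregate kept running, but each CAS sent to the remote delegate raised an exception.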

        Meghana Marathe added a comment -

        Hi Jerry,

        So do I need to update my UIMA code to this version so that the error doesn't occur? Or is there another workaround you would suggest (in case the release isn't stable, etc.)?

        Thanks,

        meghana


          People

          • Assignee:
            Jerry Cwiklik
            Reporter:
            Jerry Cwiklik
          • Votes:
            0
            Watchers:
            1