SYNAPSE-205

NPE in HttpCoreNIOSender and "I/O reactor has been shut down"

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: NIGHTLY
    • Fix Version/s: 1.1.1
    • Component/s: Transports
    • Labels:
      None

      Description

      When the target service is not available, a request to the proxy service causes the following exception:

      Exception in thread "HttpCoreNIOSender" java.lang.NullPointerException
      at org.apache.synapse.transport.nhttp.HttpCoreNIOSender$3.handleError(HttpCoreNIOSender.java:460)
      at org.apache.synapse.transport.nhttp.HttpCoreNIOSender$3.timeout(HttpCoreNIOSender.java:439)
      at org.apache.http.impl.nio.reactor.SessionRequestImpl.timeout(SessionRequestImpl.java:151)
      at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processTimeouts(DefaultConnectingIOReactor.java:152)
      at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvents(DefaultConnectingIOReactor.java:96)
      at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:158)
      at org.apache.synapse.transport.nhttp.HttpCoreNIOSender.executeClientEngine(HttpCoreNIOSender.java:139)
      at org.apache.synapse.transport.nhttp.HttpCoreNIOSender.access$000(HttpCoreNIOSender.java:68)
      at org.apache.synapse.transport.nhttp.HttpCoreNIOSender$1.run(HttpCoreNIOSender.java:101)
      at java.lang.Thread.run(Thread.java:613)

      Any further request then fails with "java.lang.IllegalStateException: I/O reactor has been shut down".

      The instruction in HttpCoreNIOSender that causes the NPE is as follows:

      MessageContext nioFaultMessageContext =
          MessageContextBuilder.createFaultMessageContext(
              mc, new AxisFault(exception.toString(), exception));

      Probably, when handleError is called by the timeout (rather than the failed) method, as is the case here (see stacktrace), exception is null. The handleError method doesn't handle this situation appropriately.
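      The kind of guard this suggests can be sketched as follows. This is a minimal illustration, not the attached patch: the class name, the helper name `faultReason`, and the fallback wording are assumptions made for the example. The only thing taken from the snippet above is that the fault text comes from `exception.toString()`, which cannot be called when the exception is null.

```java
class FaultReasonSketch {

    // Hypothetical helper: builds the fault text without dereferencing
    // a null exception, which is what the timeout path appears to pass.
    static String faultReason(Exception exception) {
        if (exception == null) {
            // Assumed wording; the timeout path supplies no exception.
            return "Connection timed out";
        }
        return exception.toString();
    }

    public static void main(String[] args) {
        // Timeout case: no exception available.
        System.out.println(faultReason(null));
        // Failure case: an exception is available.
        System.out.println(faultReason(new java.io.IOException("refused")));
    }
}
```

      With a helper like this, the fault could be created from the returned text alone when no exception is available, instead of passing a null cause through `toString()`.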

      Note that this issue is similar but not identical to the one described in SYNAPSE-168.

      1. synapse-205.patch.txt
        2 kB
        Andreas Veithen
      2. synapse-205-with-cancel.patch.txt
        3 kB
        Andreas Veithen

          Activity

          Asankha C. Perera added a comment -

          Andreas

          After I fixed 168 and 188, I don't think I can reproduce this. Could you share any configuration that would help me re-create it? Or can you check whether this works as expected with the latest SVN?

          thanks
          asankha

          Andreas Veithen added a comment -

          I'm still having the same issue with the latest sources from SVN. After investigating what happens at the TCP/IP level, it seems that this error can be reproduced when all of the following conditions are met:
          1) The target host is up.
          2) The target port is closed.
          3) The target OS doesn't respond to SYN packets.

          Note that:

          • When condition 1 is not met, you might get "java.net.SocketException: Network is unreachable" instead of the NPE.
          • When condition 3 is not met, you will always get "java.net.ConnectException: Connection refused".

          Probably, when you tried to reproduce the problem, you were in one of these cases.

          Also note that normally a host is supposed to reply with a RST packet when the destination port is closed. In my case, the target host is a Windows machine. It doesn't respond at all to the SYN packets, probably due to some firewall settings. Since no reply is received whatsoever (neither an ICMP destination network/host/port unreachable nor a TCP RST), we are in the case where Socket#connect would throw a java.net.SocketTimeoutException.
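          For comparison, the blocking-I/O behaviour described above can be sketched with plain `java.net` calls. The class name, the `probe` and `classify` helpers, and the label strings are hypothetical and exist only for this illustration; the mapping is a rough guide, since the exact exception and message depend on the OS and network.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.SocketException;
import java.net.SocketTimeoutException;

class ConnectOutcomeSketch {

    // Maps the exception from a blocking connect attempt to the cases
    // discussed above. The labels are illustrative, not exhaustive.
    static String classify(IOException e) {
        if (e instanceof SocketTimeoutException) {
            // No RST and no ICMP error arrived before the timeout.
            return "timeout: no reply to SYN";
        }
        if (e instanceof ConnectException) {
            // Typically "Connection refused" when the host answers with RST.
            return "connect failed: " + e.getMessage();
        }
        if (e instanceof SocketException) {
            // For example "Network is unreachable".
            return "network error: " + e.getMessage();
        }
        return "other: " + e;
    }

    // Attempts a blocking connect with an explicit timeout in milliseconds.
    static String probe(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return "connected";
        } catch (IOException e) {
            return classify(e);
        }
    }
}
```

          Calling `probe` against a host that silently drops SYN packets should end in the timeout branch, which is the case the NIO transport has to detect on its own because no exception is raised for it.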

          Asankha C. Perera added a comment -

          I think I'm going to need more time to fix this one properly, especially to test it, so I am postponing the fix until after 1.1.1.

          Andreas Veithen added a comment -

          The problem actually is that when a connection timeout occurs, DefaultConnectingIOReactor doesn't propagate an exception to SessionRequestCallback (there is none to propagate). This causes the NPE in the anonymous SessionRequestCallback implementation inside HttpCoreNIOSender. Maybe Oleg can comment on this? The attached patch corrects this issue. I now have another error, but I think this one is related to a misconfiguration (no proper fault sequence defined).

          Oleg Kalnichevski added a comment -

          Andreas,

          Unlike classic I/O, NIO does not use SocketTimeoutException to signal a timeout on a socket read operation and leaves it up to the transport layer to decide how to propagate the timeout condition to the application layer. So, in the case of a session request failing due to a timeout, there is no exception instance to start with that could be propagated to the SessionRequestCallback. Theoretically I could create one, but I do not see a point in doing so. SessionRequestCallback implementations ought to be prepared to deal with SessionRequest#getException() being null, especially in case of a timeout.

          Hope this helps

          Oleg

          Andreas Veithen added a comment -

          Thanks Oleg for your quick feedback. It confirms the analysis I based my patch on.

          Andreas Veithen added a comment -

          After setting up an appropriate fault sequence in my synapse.xml, the timeout fault is correctly reported back to the client. However, after this Synapse starts looping on the following error:

          2008-01-13 13:24:39,248 [-] [HttpCoreNIOSender] ERROR HttpCoreNIOSender Unable to report back failure to the message receiver
          org.apache.axis2.AxisFault: A message was added that is not valid. However, the operation context was complete.
          at org.apache.axis2.description.TwoChannelAxisOperation.addFaultMessageContext(TwoChannelAxisOperation.java:102)
          at org.apache.axis2.util.MessageContextBuilder.createFaultMessageContext(MessageContextBuilder.java:260)
          at org.apache.synapse.transport.nhttp.HttpCoreNIOSender$3.handleError(HttpCoreNIOSender.java:468)
          at org.apache.synapse.transport.nhttp.HttpCoreNIOSender$3.timeout(HttpCoreNIOSender.java:439)
          at org.apache.http.impl.nio.reactor.SessionRequestImpl.timeout(SessionRequestImpl.java:151)
          at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processTimeouts(DefaultConnectingIOReactor.java:152)
          at org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvents(DefaultConnectingIOReactor.java:96)
          at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:158)
          at org.apache.synapse.transport.nhttp.HttpCoreNIOSender.executeClientEngine(HttpCoreNIOSender.java:139)
          at org.apache.synapse.transport.nhttp.HttpCoreNIOSender.access$000(HttpCoreNIOSender.java:68)
          at org.apache.synapse.transport.nhttp.HttpCoreNIOSender$1.run(HttpCoreNIOSender.java:101)
          at java.lang.Thread.run(Thread.java:613)

          This message is shown every second. After looking at the code in httpcore, it seems that a timeout actually doesn't cancel the connection request. I was able to work around this problem by adding a call to SessionRequest#cancel in the timeout method of the SessionRequestCallback implementation in HttpCoreNIOSender.

          I attached a new version of the patch that solves the problem for me.

          @Oleg: Can you comment on the timeout-cancel issue?
          @Asankha: I would definitely like to see this issue solved in 1.1.1 (remember that when the NPE occurs, Synapse stops working correctly!). Do you think this is possible?
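          The repeated error and the effect of the workaround can be illustrated with a toy model. This is not HttpCore code and makes no claim about its internals: `PendingRequest`, `TimeoutCallback`, and `runTicks` are hypothetical stand-ins, and the model simply assumes that an overdue request keeps being reported on every pass until it is cancelled.

```java
import java.util.ArrayList;
import java.util.List;

class TimeoutCancelSketch {

    // Stand-in for a pending connection request (not the HttpCore type).
    static class PendingRequest {
        final long deadline;
        boolean cancelled;

        PendingRequest(long deadline) {
            this.deadline = deadline;
        }

        void cancel() {
            cancelled = true;
        }
    }

    // Stand-in for the timeout half of a session request callback.
    interface TimeoutCallback {
        void timeout(PendingRequest request);
    }

    // Toy reactor loop: on each tick, cancelled requests are dropped and
    // every remaining overdue request is reported to the callback.
    // Returns the total number of timeout notifications delivered.
    static int runTicks(List<PendingRequest> pending, TimeoutCallback callback, int ticks) {
        int notifications = 0;
        for (long now = 1; now <= ticks; now++) {
            pending.removeIf(request -> request.cancelled);
            for (PendingRequest request : pending) {
                if (now > request.deadline) {
                    callback.timeout(request);
                    notifications++;
                }
            }
        }
        return notifications;
    }

    // Runs five ticks with one request whose deadline is tick 1.
    static int simulate(boolean cancelOnTimeout) {
        List<PendingRequest> pending = new ArrayList<>();
        pending.add(new PendingRequest(1));
        return runTicks(pending, request -> {
            if (cancelOnTimeout) {
                request.cancel();
            }
        }, 5);
    }
}
```

          In this model, `simulate(false)` delivers a notification on every tick after the deadline, while `simulate(true)` delivers exactly one, which mirrors the reason for calling SessionRequest#cancel from the timeout method.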

          Asankha C. Perera added a comment -

          Andreas

          The delay in looking at this in detail was due to setting up a test environment with Windows etc. to reproduce it. Though I wasn't able to actually reproduce it or test the patch yet, I am confident that you have done so already, and thus I am committing this so that it will go into 1.1.1.

          asankha


            People

            • Assignee: Asankha C. Perera
            • Reporter: Andreas Veithen