Apache NiFi / NIFI-2699

Improve handling of response timeouts in cluster

Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Core Framework, Core UI
    • Labels: None

    Description

      When NiFi is running as a cluster and a node cannot respond within the socket timeout (e.g., because it is paused at a breakpoint while debugging), an IllegalClusterStateException is thrown, which causes the UI to show the "check config and fix errors" page. Once the node is communicating with the cluster again (i.e., execution has moved past the breakpoint), the UI can be reloaded, and the cluster recovers from the timeout without any user intervention at the service level. However, the user experience could be improved: if a user initiates a request that is replicated to a node that cannot respond within the socket timeout, the error page can give the impression that NiFi has crashed when it has not.

      Here is the stack trace that was encountered during testing:

      2016-08-29 11:36:59,041 DEBUG [NiFi Web Server-22] o.a.n.w.a.c.IllegalClusterStateExceptionMapper
      org.apache.nifi.cluster.manager.exception.IllegalClusterStateException: Node localhost:8443 is unable to fulfill this request due to: Unexpected Response Code 500
              at org.apache.nifi.cluster.coordination.http.replication.ThreadPoolRequestReplicator$2.onCompletion(ThreadPoolRequestReplicator.java:471) ~[nifi-framework-cluster-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
              at org.apache.nifi.cluster.coordination.http.replication.ThreadPoolRequestReplicator$NodeHttpRequest.run(ThreadPoolRequestReplicator.java:729) ~[nifi-framework-cluster-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
              at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_92]
              at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_92]
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_92]
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[na:1.8.0_92]
              at java.lang.Thread.run(Thread.java:745) [na:1.8.0_92]
      Caused by: com.sun.jersey.api.client.ClientHandlerException: java.net.SocketTimeoutException: Read timed out
              at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:155) ~[jersey-client-1.19.jar:1.19]
              at com.sun.jersey.api.client.Client.handle(Client.java:652) ~[jersey-client-1.19.jar:1.19]
              at com.sun.jersey.api.client.WebResource.handle(WebResource.java:682) ~[jersey-client-1.19.jar:1.19]
              at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74) ~[jersey-client-1.19.jar:1.19]
              at com.sun.jersey.api.client.WebResource$Builder.post(WebResource.java:560) ~[jersey-client-1.19.jar:1.19]
              at org.apache.nifi.cluster.coordination.http.replication.ThreadPoolRequestReplicator.replicateRequest(ThreadPoolRequestReplicator.java:537) ~[nifi-framework-cluster-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
              at org.apache.nifi.cluster.coordination.http.replication.ThreadPoolRequestReplicator$NodeHttpRequest.run(ThreadPoolRequestReplicator.java:720) ~[nifi-framework-cluster-1.0.0-SNAPSHOT.jar:1.0.0-SNAPSHOT]
              ... 5 common frames omitted
      Caused by: java.net.SocketTimeoutException: Read timed out
              at java.net.SocketInputStream.socketRead0(Native Method) ~[na:1.8.0_92]
              at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) ~[na:1.8.0_92]
              at java.net.SocketInputStream.read(SocketInputStream.java:170) ~[na:1.8.0_92]
              at java.net.SocketInputStream.read(SocketInputStream.java:141) ~[na:1.8.0_92]
              at sun.security.ssl.InputRecord.readFully(InputRecord.java:465) ~[na:1.8.0_92]
              at sun.security.ssl.InputRecord.read(InputRecord.java:503) ~[na:1.8.0_92]
              at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:973) ~[na:1.8.0_92]
              at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:930) ~[na:1.8.0_92]
              at sun.security.ssl.AppInputStream.read(AppInputStream.java:105) ~[na:1.8.0_92]
              at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) ~[na:1.8.0_92]
              at java.io.BufferedInputStream.read1(BufferedInputStream.java:286) ~[na:1.8.0_92]
              at java.io.BufferedInputStream.read(BufferedInputStream.java:345) ~[na:1.8.0_92]
              at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:704) ~[na:1.8.0_92]
              at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:647) ~[na:1.8.0_92]
              at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1536) ~[na:1.8.0_92]
              at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1441) ~[na:1.8.0_92]
              at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480) ~[na:1.8.0_92]
              at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:338) ~[na:1.8.0_92]
              at com.sun.jersey.client.urlconnection.URLConnectionClientHandler._invoke(URLConnectionClientHandler.java:253) ~[jersey-client-1.19.jar:1.19]
              at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:153) ~[jersey-client-1.19.jar:1.19]
              ... 11 common frames omitted
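
      The trace above shows the read timeout buried two levels down the cause chain. One possible direction, sketched here with JDK classes only, is to walk that chain and produce a timeout-specific message instead of the generic error page. The class name ReplicationFailureClassifier, its methods, and the message wording are hypothetical and are not part of NiFi; plain RuntimeException and IllegalStateException stand in for the Jersey and NiFi exception types.

```java
import java.net.SocketTimeoutException;
import java.util.Optional;

// Hypothetical helper, not a NiFi class: classifies a replication failure
// by looking for a socket timeout anywhere in its cause chain.
final class ReplicationFailureClassifier {

    /** Returns the first SocketTimeoutException found in the cause chain, if any. */
    static Optional<SocketTimeoutException> findTimeout(Throwable error) {
        Throwable current = error;
        // Bound the walk so a cyclic cause chain cannot loop forever.
        for (int depth = 0; current != null && depth < 32; depth++) {
            if (current instanceof SocketTimeoutException) {
                return Optional.of((SocketTimeoutException) current);
            }
            current = current.getCause();
        }
        return Optional.empty();
    }

    /** Builds a message that distinguishes a slow node from other failures. */
    static String userMessage(String nodeId, Throwable error) {
        if (findTimeout(error).isPresent()) {
            return "Node " + nodeId + " did not respond in time. "
                    + "NiFi is still running; reload the page to retry.";
        }
        return "Node " + nodeId + " is unable to fulfill this request due to: "
                + error.getMessage();
    }

    public static void main(String[] args) {
        // Stand-ins for the chain in the trace:
        // cluster exception -> client handler exception -> read timeout.
        Throwable timeout = new SocketTimeoutException("Read timed out");
        Throwable clientFailure = new RuntimeException(timeout);
        Throwable clusterFailure =
                new IllegalStateException("Unexpected Response Code 500", clientFailure);

        System.out.println(userMessage("localhost:8443", clusterFailure));
        System.out.println(userMessage("localhost:8443",
                new IllegalStateException("Unexpected Response Code 500")));
    }
}
```

      Where such a check would live (the exception mapper, the replicator, or the UI) is left open here; the sketch only illustrates that the timeout can be told apart from other failures using information already present in the exception.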
      

          People

            Assignee: Unassigned
            Reporter: Jeff Storck (jtstorck)
            Votes: 1
            Watchers: 5
