Apache NiFi / NIFI-5522

HandleHttpRequest enters a fault state and does not recover



    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.7.0, 1.7.1
    • Fix Version/s: 1.8.0
    • None


      HandleHttpRequest randomly enters a fault state and does not recover until I restart the node. I suspect the problem is triggered when some exception occurs (e.g., a broken request or connection issues), but I can usually reproduce this behavior by stressing the node with a large number of simultaneous requests:

      # example script to stress the server
      for i in `seq 1 10000`; do
         # 10 s timeout, up to 10 retries, quiet, response body discarded
         wget -T10 -t10 -qO- '' >/dev/null &
      done
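A variant of the stress loop that records the HTTP status code of every request makes the moment the processor starts failing easier to see. This is only a sketch: the `TARGET_URL` default is a hypothetical placeholder (the report does not give the host or port), and the function names `fire_requests` and `tally_status_codes` are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical endpoint: the report does not state the URL or port of the listener.
TARGET_URL="${TARGET_URL:-http://localhost:8080/}"

# Send COUNT requests in parallel and print one HTTP status code per request.
# curl prints 000 when no HTTP response was received (timeout, refused connection).
fire_requests() {
  local count="$1"
  for _ in $(seq 1 "$count"); do
    curl -s -o /dev/null --max-time 10 -w '%{http_code}\n' "$TARGET_URL" &
  done
  wait   # block until every background request has finished
}

# Read status codes from stdin and print "<count> <code>" for each distinct code.
tally_status_codes() {
  sort | uniq -c | awk '{print $1, $2}'
}

# Usage:
#   fire_requests 10000 | tally_status_codes
```

A growing count of 503 (or 000) entries between runs would be consistent with the behavior described here.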

      When this happens, HandleHttpRequest starts returning "HTTP ERROR 503 - Service Unavailable" and does not recover from this state.
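To tell a transient overload apart from the stuck state described here, one option is to stop the load and keep probing the endpoint for a while. The sketch below assumes a hypothetical `TARGET_URL` (not given in the report) and uses illustrative helper names; it treats the processor as stuck only if every probe returned 503.

```shell
#!/usr/bin/env bash
# Hypothetical endpoint: the report does not state the URL or port of the listener.
TARGET_URL="${TARGET_URL:-http://localhost:8080/}"

# Probe the endpoint ATTEMPTS times, INTERVAL seconds apart,
# printing one HTTP status code per probe (000 if no HTTP response arrived).
probe_status() {
  local attempts="$1" interval="$2"
  for _ in $(seq 1 "$attempts"); do
    curl -s -o /dev/null --max-time 10 -w '%{http_code}\n' "$TARGET_URL"
    sleep "$interval"
  done
}

# Succeed only if stdin holds at least one status code and all of them are 503.
all_unavailable() {
  awk '{ n++; if ($0 != "503") bad = 1 } END { exit (n > 0 && !bad) ? 0 : 1 }'
}

# Usage, once the stress script has finished:
#   probe_status 12 5 | all_unavailable && echo "still returning 503 after one minute"
```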

      If I try to stop the HandleHttpRequest processor, the running threads do not terminate.

      If I force them to terminate, the listening port remains bound by NiFi.

      If I try to connect again, I get an HTTP ERROR 500.


      If I try to start the HandleHttpRequest processor again, it fails to start with the following message:

      ERROR [Timer-Driven Process Thread-11] o.a.n.p.standard.HandleHttpRequest HandleHttpRequest[id=9bae326b-5ac3-3e9f-2dac-c0399d8f2ddb] Failed to process session due to org.apache.nifi.processor.exception.ProcessException: Failed to initialize the server: org.apache.nifi.processor.exception.ProcessException: Failed to initialize the server
      org.apache.nifi.processor.exception.ProcessException: Failed to initialize the server
          at org.apache.nifi.processors.standard.HandleHttpRequest.onTrigger(HandleHttpRequest.java:501)
          at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
          at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
          at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
          at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
          at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
          at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
          at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
          at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
          at java.lang.Thread.run(Thread.java:748)
      Caused by: java.net.BindException: Address already in use
          at sun.nio.ch.Net.bind0(Native Method)
          at sun.nio.ch.Net.bind(Net.java:433)
          at sun.nio.ch.Net.bind(Net.java:425)
          at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
          at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
          at org.eclipse.jetty.server.ServerConnector.open(ServerConnector.java:298)
          at org.eclipse.jetty.server.AbstractNetworkConnector.doStart(AbstractNetworkConnector.java:80)
          at org.eclipse.jetty.server.ServerConnector.doStart(ServerConnector.java:236)
          at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
          at org.eclipse.jetty.server.Server.doStart(Server.java:431)
          at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
          at org.apache.nifi.processors.standard.HandleHttpRequest.initializeServer(HandleHttpRequest.java:430)
          at org.apache.nifi.processors.standard.HandleHttpRequest.onTrigger(HandleHttpRequest.java:489)
          ... 11 common frames omitted
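The `java.net.BindException: Address already in use` in this trace suggests that something still holds the listening socket after the processor was stopped. A rough way to confirm that from the NiFi host is sketched below. The host and port in the usage lines are placeholders (the report does not name them), the function names are illustrative, and the first check relies on bash's `/dev/tcp` redirection, so it needs bash rather than a plain POSIX shell.

```shell
#!/usr/bin/env bash

# Succeed if a TCP connection to HOST:PORT can be opened, i.e. something is
# still accepting connections there. Uses bash's built-in /dev/tcp redirection.
port_is_listening() {
  local host="$1" port="$2"
  (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

# Show which process owns the listening socket on PORT (requires lsof, and
# usually root privileges to see processes owned by other users).
show_port_owner() {
  local port="$1"
  lsof -nP -iTCP:"$port" -sTCP:LISTEN
}

# Usage (8080 is a placeholder for the processor's configured listening port):
#   port_is_listening 127.0.0.1 8080 && show_port_owner 8080
```

If the owner turns out to be the NiFi JVM itself while the processor is stopped, that would match the leaked-socket behavior reported here.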



      The only way to work around this when it happens is to change the port the processor listens on or to restart the NiFi service. I flagged this as a security issue because it allows someone to cause a DoS of the service.

      I found several similar issues, but most of them relate to old versions. I can confirm this affects versions 1.7.0 and 1.7.1.


        Attachments

        1. image-2018-08-15-21-10-27-926.png
          27 kB
          Diego Queiroz
        2. image-2018-08-15-21-10-33-515.png
          27 kB
          Diego Queiroz
        3. image-2018-08-15-21-11-57-818.png
          18 kB
          Diego Queiroz
        4. image-2018-08-15-21-15-35-364.png
          109 kB
          Diego Queiroz
        5. image-2018-08-15-21-19-34-431.png
          28 kB
          Diego Queiroz
        6. image-2018-08-15-21-20-31-819.png
          50 kB
          Diego Queiroz
        7. test_http_req_resp.xml
          18 kB
          Otto Fowler
        8. HandleHttpRequest_Error_Template.xml
          17 kB
          Diego Queiroz

        Assignee: Unassigned
        Reporter: Diego Queiroz