Apache NiFi / NIFI-6736

If not given enough threads, Load Balanced Connections may block for long periods of time without making progress

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.10.0
    • Component/s: Core Framework
    • Labels: None

    Description

      When load-balanced connections are used, we have a few different properties that we can configure. Specifically, the properties with their default values are:

      nifi.cluster.load.balance.connections.per.node=4
      nifi.cluster.load.balance.max.thread.count=8
      nifi.cluster.load.balance.comms.timeout=30 sec

      If the max thread count is less than (connections per node × number of nodes in the cluster), everything still works well as long as all load-balanced connections carry reasonably high data volumes. However, if one of the connections has a low data volume, the load-balanced connections can stop pushing data for some period of time, typically about a multiple of the "comms.timeout" value.
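      The threshold described above can be checked with simple arithmetic. The following is only a sketch: the function names `required_threads` and `is_undersized` are hypothetical helpers, not part of NiFi, and the 3-node cluster is an assumed example.

```python
def required_threads(connections_per_node: int, node_count: int) -> int:
    """Thread count below which the issue describes stalls:
    connections per node multiplied by the number of nodes in the cluster."""
    return connections_per_node * node_count


def is_undersized(max_thread_count: int, connections_per_node: int, node_count: int) -> bool:
    """True when max.thread.count is below the threshold from the description."""
    return max_thread_count < required_threads(connections_per_node, node_count)


# Defaults from the issue: connections.per.node=4, max.thread.count=8.
# The 3-node cluster size is an assumption made for illustration.
if __name__ == "__main__":
    print(required_threads(4, 3))   # 4 * 3
    print(is_undersized(8, 4, 3))   # 8 threads against that threshold
```

      Under these assumptions, the default thread count is already below the threshold for a 3-node cluster, so a low-volume connection could trigger the stall.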

      This appears to be because the server uses blocking socket I/O rather than NIO. Once data has been received, the server checks whether more data is available. If nothing arrives for some period of time, the read times out, and only then is the socket connection added back to the pool of connections to read from. The thread can therefore be stuck waiting for more data, blocking progress on every other connection that thread would otherwise service.
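      The blocking behavior can be reproduced in miniature. This is a simplified model, not NiFi's actual server code: `serve_once` and `run_demo` are hypothetical names, a local socket pair stands in for the cluster connections, and the timeout is shortened from 30 sec to 0.2 sec so the demo finishes quickly.

```python
import socket
import time

# Stand-in for nifi.cluster.load.balance.comms.timeout, shortened for the demo.
COMMS_TIMEOUT = 0.2


def serve_once(conn: socket.socket, timeout: float) -> bytes:
    """Read from a connection with blocking I/O, then keep waiting for more
    data until the read times out. Only then does the caller get the thread back."""
    conn.settimeout(timeout)
    received = []
    try:
        while True:
            chunk = conn.recv(1024)
            if not chunk:          # peer closed the connection
                break
            received.append(chunk)
    except socket.timeout:
        pass                       # no more data arrived within the timeout
    return b"".join(received)


def run_demo(timeout: float = COMMS_TIMEOUT):
    """One thread services a low-volume connection before a busy one."""
    low_client, low_server = socket.socketpair()
    busy_client, busy_server = socket.socketpair()
    try:
        low_client.sendall(b"one small flowfile")
        busy_client.sendall(b"lots of data waiting")

        start = time.monotonic()
        low_data = serve_once(low_server, timeout)
        # Time the busy connection had to wait before being serviced at all.
        delay = time.monotonic() - start
        busy_data = serve_once(busy_server, timeout)
        return low_data, busy_data, delay
    finally:
        for sock in (low_client, low_server, busy_client, busy_server):
            sock.close()


if __name__ == "__main__":
    low, busy, waited = run_demo()
    print(f"busy connection waited {waited:.2f}s behind the idle one")
```

      In this model the busy connection's data is ready immediately, yet it waits for roughly one full timeout because the only available thread is blocked on the idle connection. A non-blocking (NIO-style) server would instead return the idle connection to the pool as soon as no data is ready.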

            People

              Assignee: Mark Payne (markap14)
              Reporter: Mark Payne (markap14)

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 40m