Details
Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Description
When a connection is configured to use Round Robin load balancing, the FlowFile queue distributes FlowFiles in strict rotation: one is queued to be processed locally, the next to be sent to Node 2, the next to be sent to Node 3, the next to be processed locally again, and so on (in this case, assuming a 3-node cluster).
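The rotation described above can be illustrated with a minimal Python sketch. It is not NiFi code: the partition names and the `assign_round_robin` helper are hypothetical and exist only to show the order in which FlowFiles are assigned.

```python
from itertools import cycle

# Hypothetical partition names for the 3-node cluster in the example.
PARTITIONS = ["local", "node-2", "node-3"]

def assign_round_robin(flowfiles, partitions=PARTITIONS):
    """Pair each FlowFile with the next partition in strict rotation."""
    # zip stops when the FlowFiles run out; cycle repeats the partitions.
    return list(zip(flowfiles, cycle(partitions)))

assignments = assign_round_robin(["ff-1", "ff-2", "ff-3", "ff-4"])
for flowfile, partition in assignments:
    print(flowfile, "->", partition)
```

The fourth FlowFile wraps around to the local partition. Note that the rotation never looks at how full each partition already is.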
If one node in the cluster is slow, though, the local partition and the partition for Node 2 can both be empty while Node 3's partition is full, because Node 3 is not processing the data quickly enough. As a result, the queue on Node 1 ends up applying backpressure, with all FlowFiles in the queue waiting to be pushed to Node 3.
In such a situation, no data can be processed by Node 1 or Node 2, even though both have spare capacity. It would be advantageous to improve this so that Node 1 and Node 2 could still be busy processing data.
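To make the scenario concrete, here is a toy simulation in Python. It is a sketch under stated assumptions, not NiFi's implementation: time advances in discrete ticks, the drain rates and threshold are invented, and the `skip_backed_up` variant (which passes over any partition holding at least an equal share of the threshold) is only one hypothetical way the behavior could be improved. The ticket itself does not prescribe a fix.

```python
from collections import deque

def simulate(ticks, threshold, drain_rates, arrivals_per_tick, skip_backed_up):
    """Toy model of the queue on Node 1, with one partition per node.

    Returns (partition_sizes, processed_per_node) after `ticks` steps.
    """
    n = len(drain_rates)
    partitions = [deque() for _ in range(n)]
    processed = [0] * n
    fair_share = threshold // n  # assumed per-partition cap (hypothetical)
    next_index = 0
    for _ in range(ticks):
        for _ in range(arrivals_per_tick):
            if sum(len(p) for p in partitions) >= threshold:
                break  # backpressure: upstream may not enqueue more
            if skip_backed_up:
                # Hypothetical variant: pass over partitions that already
                # hold their fair share, as long as another one has room.
                for _ in range(n):
                    if len(partitions[next_index]) < fair_share:
                        break
                    next_index = (next_index + 1) % n
            partitions[next_index].append(object())
            next_index = (next_index + 1) % n
        # Each node drains its own partition at its own rate.
        for i, rate in enumerate(drain_rates):
            for _ in range(min(rate, len(partitions[i]))):
                partitions[i].popleft()
                processed[i] += 1
    return [len(p) for p in partitions], processed

# Node 1 (local) and Node 2 are fast; Node 3 is slow. All numbers are invented.
params = dict(ticks=50, threshold=12, drain_rates=[4, 4, 1], arrivals_per_tick=6)
strict_sizes, strict_done = simulate(skip_backed_up=False, **params)
skip_sizes, skip_done = simulate(skip_backed_up=True, **params)

print("strict:", strict_sizes, strict_done)
print("skip:  ", skip_sizes, skip_done)
```

With strict rotation the run ends with the first two partitions empty and everything left in the queue destined for Node 3, so the fast nodes are throttled to roughly the slow node's pace. In the skipping variant the fast nodes process noticeably more over the same period. A real design would also have to decide how skipped FlowFiles affect ordering and fairness, which this sketch ignores.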