Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
1.14.0, 1.22.0
-
None
Description
When a queue reaches the swap threshold (defined in nifi.properties as nifi.queue.swap.threshold and defaulted to 20,000 FlowFiles), it enters 'swap mode'. However, it never exits swap mode.
This means that even if the queue is completely emptied, any data that later enters the queue will be swapped out once the queue reaches 10K FlowFiles. Additionally, handling this incurs significant overhead under the covers.
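The behavior described above can be illustrated with a toy model. This is only a sketch, not NiFi's actual queue implementation: the class `ToyQueue`, its fields, and the `exits_swap_mode` flag are hypothetical names, and the per-FlowFile swapping is a simplification. The only values taken from this report are the 20,000 swap threshold and the 10K level at which data is swapped out while in swap mode.

```python
class ToyQueue:
    """Hypothetical, simplified model of a queue with a 'swap mode' flag."""

    SWAP_THRESHOLD = 20_000      # nifi.queue.swap.threshold default, per this report
    SWAP_MODE_ACTIVE_LIMIT = 10_000  # 10K level mentioned in this report

    def __init__(self, exits_swap_mode: bool):
        self.active = 0           # FlowFiles held in memory
        self.swapped = 0          # FlowFiles swapped out
        self.swap_mode = False
        # Assumption: True models the expected behavior, False the reported defect.
        self.exits_swap_mode = exits_swap_mode

    def put(self, count: int) -> None:
        for _ in range(count):
            if self.swap_mode and self.active >= self.SWAP_MODE_ACTIVE_LIMIT:
                self.swapped += 1
            else:
                self.active += 1
            if self.active + self.swapped >= self.SWAP_THRESHOLD:
                self.swap_mode = True

    def drain(self) -> None:
        """Empty the queue completely."""
        self.active = 0
        self.swapped = 0
        if self.exits_swap_mode:
            self.swap_mode = False


def swapped_after_refill(exits_swap_mode: bool) -> int:
    q = ToyQueue(exits_swap_mode)
    q.put(25_000)   # burst past the swap threshold
    q.drain()       # queue is now completely empty
    q.put(15_000)   # refill to a level below the swap threshold
    return q.swapped


buggy = swapped_after_refill(exits_swap_mode=False)
expected = swapped_after_refill(exits_swap_mode=True)
print(buggy, expected)
```

In this model, a queue that never leaves swap mode swaps out everything above 10K on the refill, even though the refill never reaches the 20,000 threshold, while a queue that resets the flag when emptied swaps nothing.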
To replicate, create a simple flow:
GenerateFlowFile -> UpdateAttribute.
Set GenerateFlowFile to run with 6 threads, a Run Schedule of "0 secs" and a Run Duration of "100 ms". Auto-terminate the 'success' relationship of UpdateAttribute.
This will quickly fill the queue beyond 20K FlowFiles.
Now, stop GenerateFlowFile. Lower it to 4 threads and a Run Duration of "10 ms".
Start both processors. Watch the logs, which indicate that data is constantly being swapped in and out.
This can have a very significant impact on performance. In my testing on my laptop, once this flow started swapping, its 5-minute stats dropped from 14.5 MM FlowFiles per 5 minutes to 11 MM FlowFiles (roughly a 24% decline).
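For reference, the size of the drop can be worked out from the two reported figures. The snippet below is just arithmetic on those numbers; the variable names are illustrative. It shows the change both relative to the original rate and relative to the rate while swapping, since the two percentages differ noticeably.

```python
before = 14.5  # million FlowFiles per 5 minutes, before swapping started
after = 11.0   # million FlowFiles per 5 minutes, while swapping

# Decline measured against the original throughput.
decline = (before - after) / before

# How much higher the original throughput was than the swapping throughput.
overhead = (before - after) / after

print(f"decline vs. original rate: {decline:.1%}")
print(f"original rate vs. swapping rate: {overhead:.1%} higher")
```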
In addition to lower throughput, it causes much higher resource utilization, which affects all flows.
This defect may affect anyone using a large number of small FlowFiles, especially flows where data is bursty enough to exceed the 20,000-FlowFile swap threshold, or flows that have a Backpressure Threshold set above 10,000.