I have a GetTwitter processor running on a 3-node NiFi cluster, configured to execute on the primary node only.
The symptom is that HTTP 420 ("Enhance Your Calm") exceptions occur too frequently when the GetTwitter processor starts.
I ran the following tests:
- With only 1 NiFi node running, I was able to start/stop the GetTwitter processor 10 times in a row without any errors.
- With 2 NiFi nodes running, HTTP 420 errors occurred after a few start/stop cycles (sometimes even after a single start).
After analyzing the source code, and knowing about https://issues.apache.org/jira/browse/NIFI-2592, I came to the conclusion that the GetTwitter processor initializes the connection to the Twitter API on every node of the cluster, including non-primary nodes.
The `onScheduled()` method runs on every node (see NIFI-2592) and opens a connection to Twitter with `client.connect()`. The `onTrigger()` method then consumes the tweets normally, but only on the primary node.
The issue is that having more than one node initializing connections makes the Twitter API raise HTTP 420 errors.
Possible solutions:
- Change the behavior of the `onScheduled()` method so that it runs only on the primary node (as proposed in NIFI-2592).
- Update the GetTwitter processor implementation so that it no longer calls `client.connect()` from the `onScheduled()` method, but only when PrimaryNodeState changes to ELECTED_PRIMARY_NODE (and performs a `client.stop()` when PrimaryNodeState changes to PRIMARY_NODE_REVOKED).
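To make the second option more concrete, here is a minimal sketch of the connect/stop logic, not the actual GetTwitter code. It is self-contained so it can be run without NiFi: the nested `PrimaryNodeState` enum is a local stand-in for NiFi's enum of the same name, and `StreamClient`, `PrimaryOnlyConnector` and `onPrimaryNodeStateChange` are hypothetical names used only for illustration. In a real processor I would expect the handler method to be annotated with `@OnPrimaryNodeStateChange`.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class PrimaryNodeConnectionSketch {

    // Local stand-in for NiFi's PrimaryNodeState enum (assumption: same two values).
    enum PrimaryNodeState { ELECTED_PRIMARY_NODE, PRIMARY_NODE_REVOKED }

    // Hypothetical minimal view of the streaming client: only connect() and stop().
    interface StreamClient {
        void connect();
        void stop();
    }

    // Connects only when this node becomes primary and stops when it loses that role.
    static class PrimaryOnlyConnector {
        private final StreamClient client;
        private final AtomicBoolean connected = new AtomicBoolean(false);

        PrimaryOnlyConnector(StreamClient client) {
            this.client = client;
        }

        // In a real processor this method would carry @OnPrimaryNodeStateChange.
        void onPrimaryNodeStateChange(PrimaryNodeState newState) {
            if (newState == PrimaryNodeState.ELECTED_PRIMARY_NODE) {
                // compareAndSet guards against connecting twice on repeated events.
                if (connected.compareAndSet(false, true)) {
                    client.connect();
                }
            } else if (newState == PrimaryNodeState.PRIMARY_NODE_REVOKED) {
                // Only stop a connection that this node actually opened.
                if (connected.compareAndSet(true, false)) {
                    client.stop();
                }
            }
        }

        boolean isConnected() {
            return connected.get();
        }
    }

    public static void main(String[] args) {
        StreamClient loggingClient = new StreamClient() {
            @Override public void connect() { System.out.println("connect() called"); }
            @Override public void stop() { System.out.println("stop() called"); }
        };
        PrimaryOnlyConnector connector = new PrimaryOnlyConnector(loggingClient);
        connector.onPrimaryNodeStateChange(PrimaryNodeState.ELECTED_PRIMARY_NODE);
        connector.onPrimaryNodeStateChange(PrimaryNodeState.PRIMARY_NODE_REVOKED);
    }
}
```

With this shape, a non-primary node never opens a connection, so only one node at a time talks to the Twitter API. The sketch does not cover how the node that is already primary when the processor is started gets connected; that case would still need to be handled separately.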