  1. Apache NiFi
  2. NIFI-6905

GetTwitter processor, configured to run on primary node only, initializes connection to Twitter API from every NiFi cluster node, even on non-primary nodes

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.0
    • Fix Version/s: 1.11.0
    • Component/s: Extensions
    • Labels:

      Description

      I have a GetTwitter processor running on a 3-node NiFi cluster and configured to execute on the primary node only.
      The symptom is an unusually high frequency of HTTP 420 ("Enhance Your Calm") errors when the GetTwitter processor starts.

      I made the following tests:

      • With only 1 NiFi node running, I was able to start/stop the GetTwitter processor 10 times in a row without any errors.
      • With 2 NiFi nodes running, HTTP 420 errors occurred after a few start/stop cycles (sometimes even after a single start).

      After analyzing the source code, and knowing about https://issues.apache.org/jira/browse/NIFI-2592, I came to the conclusion that the GetTwitter processor initializes the connection to the Twitter API on each node of the cluster, even on non-primary nodes.

      The `onScheduled()` method runs on every node (see NIFI-2592), and each node connects to Twitter with `client.connect()`. The `onTrigger()` method then consumes the tweets normally, but only on the primary node.
      The issue is that having more than one node initializing connections makes the Twitter API return HTTP 420 errors.

      ERROR
      org.apache.nifi.processors.twitter.GetTwitter
      GetTwitter[id=XYZ] Received error HTTP_ERROR: HTTP/1.1 420 Enhance Your Calm. Will attempt to reconnect
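
The lifecycle described above can be modeled in a few lines. This is a simplified sketch, not NiFi code: `SimulatedNode` and `ClusterStartSimulation` are hypothetical stand-ins, and the only behavior taken from this report is that `onScheduled()` runs (and connects) on every node while only the primary node goes on to consume tweets.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for one cluster node running the processor; not a NiFi class.
class SimulatedNode {
    final boolean primary;
    boolean connected = false;

    SimulatedNode(boolean primary) {
        this.primary = primary;
    }

    // Mirrors the reported behavior: the connection is opened here on every node.
    void onScheduled() {
        connected = true;
    }

    // Only the primary node is triggered, so only it ever consumes tweets.
    boolean canConsume() {
        return primary && connected;
    }
}

class ClusterStartSimulation {
    // Builds a cluster with exactly one primary node, "starts" the processor on
    // all nodes, and returns how many streaming connections were opened.
    static int connectionsOpened(int nodeCount) {
        List<SimulatedNode> nodes = new ArrayList<>();
        for (int i = 0; i < nodeCount; i++) {
            nodes.add(new SimulatedNode(i == 0));
        }
        int opened = 0;
        for (SimulatedNode node : nodes) {
            node.onScheduled();
            if (node.connected) {
                opened++;
            }
        }
        return opened;
    }

    public static void main(String[] args) {
        System.out.println("connections opened on 3 nodes: " + connectionsOpened(3));
    }
}
```

In this model the number of connections grows with the number of nodes even though a single node consumes the stream, which matches the observation that errors appear only once a second node is running.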
      

      Proposed solutions:

      1. Change the behavior of the `onScheduled()` method so that it runs only on the primary node (as proposed in NIFI-2592).
      2. Update the GetTwitter processor implementation so that it no longer calls `client.connect()` from the `onScheduled()` method, but only when PrimaryNodeState changes to ELECTED_PRIMARY_NODE (and calls `client.stop()` when PrimaryNodeState changes to PRIMARY_NODE_REVOKED).
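
The second proposal could look roughly like the sketch below. It is self-contained and does not use NiFi or the Twitter client library: the `PrimaryNodeState` enum here is a local stand-in that only mirrors the two state names above, and `StreamClient`, `CountingClient`, and `PrimaryOnlyConnector` are hypothetical names introduced for illustration. It is not the fix that was eventually committed.

```java
// Local stand-in mirroring the two state names mentioned in the report.
enum PrimaryNodeState {
    ELECTED_PRIMARY_NODE,
    PRIMARY_NODE_REVOKED
}

// Hypothetical minimal view of the streaming client: only connect() and stop().
interface StreamClient {
    void connect();
    void stop();
}

// Test double that counts calls instead of talking to Twitter.
class CountingClient implements StreamClient {
    int connectCalls = 0;
    int stopCalls = 0;

    @Override
    public void connect() { connectCalls++; }

    @Override
    public void stop() { stopCalls++; }
}

class PrimaryOnlyConnector {
    private final StreamClient client;
    private boolean connected = false;

    PrimaryOnlyConnector(StreamClient client) {
        this.client = client;
    }

    // Called on every node at schedule time; intentionally does not connect.
    void onScheduled() { }

    // Connects on election and stops on revocation; repeated events are ignored.
    synchronized void onPrimaryNodeStateChange(PrimaryNodeState newState) {
        if (newState == PrimaryNodeState.ELECTED_PRIMARY_NODE && !connected) {
            client.connect();
            connected = true;
        } else if (newState == PrimaryNodeState.PRIMARY_NODE_REVOKED && connected) {
            client.stop();
            connected = false;
        }
    }

    synchronized boolean isConnected() {
        return connected;
    }

    public static void main(String[] args) {
        CountingClient client = new CountingClient();
        PrimaryOnlyConnector connector = new PrimaryOnlyConnector(client);
        connector.onScheduled();
        connector.onPrimaryNodeStateChange(PrimaryNodeState.ELECTED_PRIMARY_NODE);
        System.out.println("connect calls: " + client.connectCalls);
    }
}
```

One caveat with this design: if the processor is started on a node that is already primary, no election event may follow, so a real implementation would probably also need to check the node's current role at schedule time.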

              People

              • Assignee: kourge-ch Kourge
              • Reporter: kourge-ch Kourge
              • Votes: 0
              • Watchers: 2

                  Time Tracking

                  • Estimated: Not Specified
                  • Remaining: 0h
                  • Logged: 0.5h