  1. Apache NiFi
  2. NIFI-6905

GetTwitter processor, configured to run on primary node only, initializes connection to Twitter API from every NiFi cluster node, even on non-primary nodes


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.0
    • Fix Version/s: 1.11.0
    • Component/s: Extensions

    Description

      I have a GetTwitter processor running on a 3-node NiFi cluster, configured to execute on the primary node only.
      The symptom is that HTTP 420 ("Enhance Your Calm") exceptions occur too frequently when the GetTwitter processor starts.

      I made the following tests:

      • With only 1 NiFi node running, I was able to start/stop the GetTwitter processor 10 times in a row without any errors.
      • With 2 NiFi nodes running, HTTP 420 errors occurred after a few start/stop cycles (sometimes even after a single start).

      After analyzing the source code, and with https://issues.apache.org/jira/browse/NIFI-2592 in mind, I concluded that the GetTwitter processor initializes the connection to the Twitter API on every node of the cluster, including non-primary nodes.

      The `onScheduled()` method runs on every node (see NIFI-2592) and connects to Twitter with `client.connect()`. The `onTrigger()` method then consumes the tweets normally, on the primary node only.
      The issue is that having more than one node initialize a connection makes the Twitter API raise HTTP 420 errors.

      ERROR
      org.apache.nifi.processors.twitter.GetTwitter
      GetTwitter[id=XYZ] Received error HTTP_ERROR: HTTP/1.1 420 Enhance Your Calm. Will attempt to reconnect
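
The behavior described above can be modeled with a small sketch. This is not the GetTwitter source: `CountingClient`, `EagerProcessor` and `ClusterModel` are hypothetical stand-ins, and the only thing carried over from the report is that `onScheduled()` runs on every node and calls `client.connect()`.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the streaming client: it only counts connect() calls.
class CountingClient {
    private final AtomicInteger opened;

    CountingClient(AtomicInteger opened) {
        this.opened = opened;
    }

    void connect() {
        opened.incrementAndGet();
    }
}

// Simplified model of a processor that connects in its scheduling hook.
class EagerProcessor {
    private final CountingClient client;

    EagerProcessor(CountingClient client) {
        this.client = client;
    }

    // Per the report, this hook runs on every node, primary or not.
    void onScheduled() {
        client.connect();
    }
}

class ClusterModel {
    // Starts the processor once per node and returns how many connections were opened.
    static int connectionsOnStart(int nodeCount) {
        AtomicInteger opened = new AtomicInteger();
        for (int node = 0; node < nodeCount; node++) {
            new EagerProcessor(new CountingClient(opened)).onScheduled();
        }
        return opened.get();
    }
}
```

In this model a single start on a 3-node cluster opens three connections even though only one node would consume tweets, which is consistent with the errors appearing only when more than one node is running.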
      

      Proposed solutions:

      1. Change the behavior of the `onScheduled()` method so that it runs only on the primary node (as proposed in NIFI-2592).
      2. Update the GetTwitter processor implementation so that it no longer calls `client.connect()` from `onScheduled()`, but only when the PrimaryNodeState changes to ELECTED_PRIMARY_NODE (and calls `client.stop()` when the PrimaryNodeState changes to PRIMARY_NODE_REVOKED).
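
A minimal sketch of the second proposal, written without NiFi dependencies so it can run on its own. The `PrimaryNodeState` enum here is a local stand-in that mirrors the two state names from the proposal, and `StreamClient`, `RecordingClient` and `PrimaryOnlyProcessor` are hypothetical names. In a real processor the handler would presumably be wired through NiFi's `@OnPrimaryNodeStateChange` annotation; that wiring is assumed and not shown.

```java
// Local stand-in mirroring the two states named in the proposal.
enum PrimaryNodeState { ELECTED_PRIMARY_NODE, PRIMARY_NODE_REVOKED }

// Hypothetical minimal client interface exposing the two calls named in the proposal.
interface StreamClient {
    void connect();
    void stop();
}

// Test double that records whether a connection is currently open.
class RecordingClient implements StreamClient {
    boolean connected = false;
    int connectCalls = 0;

    @Override
    public void connect() {
        connected = true;
        connectCalls++;
    }

    @Override
    public void stop() {
        connected = false;
    }
}

class PrimaryOnlyProcessor {
    private final StreamClient client;

    PrimaryOnlyProcessor(StreamClient client) {
        this.client = client;
    }

    // Runs on every node, so it deliberately opens no connection.
    void onScheduled() {
    }

    // Connects when this node becomes primary, stops when it loses that role.
    void onPrimaryNodeStateChange(PrimaryNodeState newState) {
        if (newState == PrimaryNodeState.ELECTED_PRIMARY_NODE) {
            client.connect();
        } else if (newState == PrimaryNodeState.PRIMARY_NODE_REVOKED) {
            client.stop();
        }
    }
}
```

One open design question with this approach: a node that is already primary when the processor is started may not receive an election event, so a real implementation would likely need a way to connect in that case as well.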


            People

              Assignee: kourge-ch Kourge
              Reporter: kourge-ch Kourge
              Votes: 0
              Watchers: 2

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 0.5h