Apache NiFi / NIFI-13896

Improving TailFile performance

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.0, 1.29.0
    • Component/s: None
    • Labels: None

    Description

      When TailFile tails a large number of files, the processor is slow because it repeatedly loops over all tailed files and performs several expensive operations on each pass.

      • In the onTrigger method, a loop (loop 1) iterates over all tailed files in the state object.
      • Inside this loop, the recoverRolledFiles method is called for each tailed file (loop 2), which leads to consumeFilesFully and finally triggers cleanup.
      • In the cleanup method, another loop (loop 3) iterates over all tailed files in the state again.
      • During cleanup, persistState is invoked for each tailed file, and each call removes any legacy state variables from the NiFi state. These legacy variables originate from NiFi 1.0, when "Multiple Tailed Files" was not yet supported, so their state keys lack the "file.x." prefix. As cleanup iterates over and persists each tailed file's state, the overall state grows by six entries per tailed file, so the legacy cleanup loop has more entries to scan and becomes progressively slower with each iteration.
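
      To make the growth concrete, here is a minimal sketch of a single cleanup pass. It is a simplified model, not the NiFi code: the class name LegacyScanCost, the method keyVisitsForOneCleanupPass, and the placeholder key names are hypothetical, and the state is assumed to start empty. Only the figure of six entries per tailed file comes from the description above.

```java
import java.util.HashMap;
import java.util.Map;

public class LegacyScanCost {
    // Six state entries per tailed file, as stated in the description.
    static final int ENTRIES_PER_FILE = 6;

    /**
     * Counts how many state keys are visited in one modeled cleanup pass:
     * for every tailed file, the whole state is scanned once (a stand-in for
     * the legacy-key check), then that file's entries are added.
     */
    static long keyVisitsForOneCleanupPass(int tailedFiles) {
        Map<String, String> state = new HashMap<>();
        long visits = 0;
        for (int file = 0; file < tailedFiles; file++) {
            // Stand-in for the legacy-key scan over the existing state.
            for (String key : state.keySet()) {
                visits++;
            }
            // Persist this file's entries; the state grows by six.
            for (int entry = 0; entry < ENTRIES_PER_FILE; entry++) {
                state.put("file." + file + ".entry" + entry, "value");
            }
        }
        return visits;
    }

    public static void main(String[] args) {
        for (int n : new int[] {10, 100, 1000}) {
            System.out.println(n + " files: " + keyVisitsForOneCleanupPass(n) + " key visits");
        }
    }
}
```

      In this model the count is 3n(n-1) for n tailed files, so 10, 100, and 1000 files give 270, 29,700, and 2,997,000 key visits. Since the description has cleanup reached from the outer per-file loop, the real cost may be multiplied by the file count again.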

      When many files are tailed, this can lead to hours of execution time.

      Suggestion for improvement:

      • Move the loop that removes old state entries out of cleanup. The removal of old entries should run once at startup instead.
      for (String key : oldState.toMap().keySet()) {
          // These entries were stored by older versions of NiFi and are no longer used.
          // Current entries carry the 'file.<index>.' prefix.
          if (TailFileState.StateKeys.CHECKSUM.equals(key)
                  || TailFileState.StateKeys.FILENAME.equals(key)
                  || TailFileState.StateKeys.POSITION.equals(key)
                  || TailFileState.StateKeys.TIMESTAMP.equals(key)) {
              getLogger().info("Removed state {}={} stored by older version of NiFi.", new Object[]{key, oldState.get(key)});
              continue;
          }
          // Carry every non-legacy entry over unchanged.
          updatedState.put(key, oldState.get(key));
      }
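
      One way the suggested one-time removal could look, sketched with plain Java maps rather than the NiFi state API. The class LegacyStateCleanup and the method removeLegacyKeys are hypothetical, and the four key strings are assumed values for the unprefixed legacy keys. This is an illustration of the idea, not the change that was merged.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class LegacyStateCleanup {
    // Assumed string values of the unprefixed legacy keys (hypothetical here).
    static final Set<String> LEGACY_KEYS =
            Set.of("filename", "position", "timestamp", "checksum");

    /**
     * Returns a copy of the state without the legacy keys.
     * Intended to be called once at startup instead of on every persist.
     */
    static Map<String, String> removeLegacyKeys(Map<String, String> oldState) {
        Map<String, String> updatedState = new HashMap<>(oldState);
        updatedState.keySet().removeAll(LEGACY_KEYS);
        return updatedState;
    }

    public static void main(String[] args) {
        Map<String, String> state = new HashMap<>();
        state.put("filename", "/var/log/app.log");        // legacy, unprefixed
        state.put("position", "1024");                    // legacy, unprefixed
        state.put("file.0.filename", "/var/log/app.log"); // current format
        state.put("file.0.position", "2048");             // current format

        Map<String, String> cleaned = removeLegacyKeys(state);
        System.out.println("Kept " + cleaned.size() + " of " + state.size() + " entries");
    }
}
```

      In the processor, such a step could be invoked once from a lifecycle method that runs when the processor is scheduled, so that persistState no longer has to rescan every key on each call.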

          People

            Assignee: Lehel Boér (Lehel44)
            Reporter: Lehel Boér (Lehel44)
            Votes: 0
            Watchers: 3

              Time Tracking

                Original Estimate: Not Specified
                Remaining Estimate: 0h
                Time Spent: 1h 20m
