Apache NiFi / NIFI-8773

Allow TailFile to hold off on ingesting lines of text if the full (multi-line) message is not available

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.15.0
    • Component/s: Extensions
    • Labels: None

      Description

      When using TailFile, there are times when multi-line messages are written to a file. For example, we may have something like:

      <1> My Message
      <2> My Message
      <3> My Message
         A continuation of my message
      

      If TailFile runs at this point, it will ingest these 4 lines of text as a FlowFile.
      The next lines to be written, though, may be something like:

        Another continuation of my message
        A final continuation
      <4> Another Message
      <5> Yet another Message
      

      In that case, we may want to avoid pulling in lines "<3> My Message" and " A continuation of my message" until the full message is available.

      We should enable this capability by allowing for a new property that specifies a Regular Expression to run against the start of a line. If we read a line from the file and it matches that Regex, then we know the previous message is complete. Otherwise, the previous message may not be complete and should be buffered (up to some configurable limit, in order to avoid exhausting the Java heap).
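The buffering behavior described above could be sketched roughly as follows. This is only an illustration, not the TailFile implementation: the class name `MultiLineTailSketch`, its methods, and the `maxBufferedChars` limit are hypothetical, and flushing the buffer as-is when the limit is exceeded is an assumption about how the limit might be enforced.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch: holds back the last (possibly incomplete) message
// until a line matching the start-of-message pattern shows it is complete.
final class MultiLineTailSketch {
    private final Pattern startOfMessage;
    private final int maxBufferedChars;          // assumed limit, in characters
    private final StringBuilder pending = new StringBuilder();

    MultiLineTailSketch(String startRegex, int maxBufferedChars) {
        this.startOfMessage = Pattern.compile(startRegex);
        this.maxBufferedChars = maxBufferedChars;
    }

    // Feeds newly read lines; returns only the messages known to be complete.
    List<String> accept(List<String> lines) {
        List<String> complete = new ArrayList<>();
        for (String line : lines) {
            // lookingAt() matches the regex against the start of the line only.
            boolean startsNewMessage = startOfMessage.matcher(line).lookingAt();
            if (startsNewMessage && pending.length() > 0) {
                // A new message begins, so the buffered one is complete.
                complete.add(pending.toString());
                pending.setLength(0);
            }
            if (pending.length() > 0) {
                pending.append('\n');
            }
            pending.append(line);
            if (pending.length() > maxBufferedChars) {
                // Assumption: on overflow, emit what we have to protect the heap.
                complete.add(pending.toString());
                pending.setLength(0);
            }
        }
        return complete;
    }

    // The text still held back, waiting for the next start-of-message line.
    String pending() {
        return pending.toString();
    }

    public static void main(String[] args) {
        MultiLineTailSketch tail = new MultiLineTailSketch("<\\d+>", 65536);
        List<String> first = tail.accept(List.of(
                "<1> My Message",
                "<2> My Message",
                "<3> My Message",
                "   A continuation of my message"));
        System.out.println("Emitted after first read: " + first);
        List<String> second = tail.accept(List.of(
                "  Another continuation of my message",
                "  A final continuation",
                "<4> Another Message",
                "<5> Yet another Message"));
        System.out.println("Emitted after second read: " + second);
        System.out.println("Still buffered: " + tail.pending());
    }
}
```

With the example file contents from this ticket, messages 1 and 2 are emitted on the first read, while message 3 is held back until the line starting with `<4>` arrives on the second read. The last message always stays buffered until another start-of-message line is seen or the limit is hit.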

            People

            • Assignee: Mark Payne (markap14)
            • Reporter: Mark Payne (markap14)
            • Votes: 0
            • Watchers: 2

                Time Tracking

                • Original Estimate: Not Specified
                • Remaining Estimate: 0h
                • Time Spent: 1.5h