Apache NiFi / NIFI-7557

Cache large/common FlowFile attributes when restoring FlowFile Repository

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.13.0
    • Component/s: Core Framework
    • Labels: None

    Description

      When NiFi restarts, it restores FlowFiles from the FlowFile Repository. Each attribute of a FlowFile is read from disk and put into a HashMap. A Processor will sometimes add the same large attribute to every FlowFile it sees, and in that case storing the FlowFiles takes much more heap after a restart than it does while NiFi is running. While NiFi is running, the Processor holds the attribute value as a single String object and adds that same String to the attribute HashMap of every FlowFile.

      On restart, however, NiFi deserializes a byte stream to produce each attribute value. Every FlowFile that has the attribute therefore ends up with its own String object, even though the same value is repeated many times.
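
      The difference between the two cases can be illustrated with a small, self-contained Java sketch. It is not NiFi code: the class name DuplicateAttributeDemo and the decode helper are hypothetical and only stand in for reading an attribute value back from a serialized byte stream.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; not part of NiFi.
class DuplicateAttributeDemo {

    // Stands in for deserializing one attribute value from disk.
    // Every call allocates a new String, even for identical bytes.
    static String decode(final byte[] serialized) {
        return new String(serialized, StandardCharsets.UTF_8);
    }

    public static void main(final String[] args) {
        final String shared = "some large attribute value";

        // While running: two FlowFiles reference the same String object.
        final Map<String, String> running1 = new HashMap<>();
        final Map<String, String> running2 = new HashMap<>();
        running1.put("big.attr", shared);
        running2.put("big.attr", shared);
        System.out.println(running1.get("big.attr") == running2.get("big.attr"));

        // After restart: each FlowFile decodes its own copy from bytes.
        final byte[] onDisk = shared.getBytes(StandardCharsets.UTF_8);
        final Map<String, String> restored1 = new HashMap<>();
        final Map<String, String> restored2 = new HashMap<>();
        restored1.put("big.attr", decode(onDisk));
        restored2.put("big.attr", decode(onDisk));
        System.out.println(restored1.get("big.attr") == restored2.get("big.attr"));
    }
}
```

      The first comparison is between two references to one object, while the second is between two equal but separately allocated Strings, so heap use grows with the number of FlowFiles.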

      As a result, a huge amount of heap may be used on restart, causing NiFi to hit an OutOfMemoryError (OOME) while restoring the FlowFile Repository.
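
      One way to picture the caching named in the issue title is a small bounded cache that hands back a single shared String for each distinct large value seen during restore. The following is a minimal sketch under stated assumptions, not the change that was actually committed to NiFi: the class AttributeValueCache, its dedupe method, and the maxEntries and minLength parameters are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; not NiFi's actual implementation.
// Not thread-safe: assumes a single thread restores the repository.
class AttributeValueCache {
    private final int minLength;
    private final Map<String, String> cache;

    AttributeValueCache(final int maxEntries, final int minLength) {
        this.minLength = minLength;
        // Access-ordered LinkedHashMap: the least recently used entry
        // is evicted once the cache holds more than maxEntries values.
        this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(final Map.Entry<String, String> eldest) {
                return size() > maxEntries;
            }
        };
    }

    // Returns a shared instance for values already seen; otherwise
    // remembers this instance and returns it unchanged.
    String dedupe(final String value) {
        if (value == null || value.length() < minLength) {
            return value; // assumption: small values are not worth caching
        }
        final String cached = cache.putIfAbsent(value, value);
        return cached == null ? value : cached;
    }
}
```

      In this sketch, a restore loop would pass each deserialized attribute value through dedupe before putting it into the FlowFile's attribute HashMap, so that repeated large values share one String. Bounding the cache keeps the lookup table itself from becoming a new source of heap pressure.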

            People

              Assignee: Mark Payne (markap14)
              Reporter: Mark Payne (markap14)
              Votes: 0
              Watchers: 2

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 40m