  Apache NiFi / NIFI-8609

Improve efficiency of converting Record object to Avro GenericRecord object

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.14.0
    • Component/s: Extensions
    • Labels:
      None

      Description

      During some performance tests and profiling, I found that for a given flow that pushes Avro records to Kafka, one of the most expensive parts of the flow was converting our Record (MapRecord) object into a GenericRecord object for the Avro Writer.

      I created a simple unit test to determine a baseline for performance numbers before making any changes. The unit test creates a Record with 100 nullable String fields, half of which have a null value assigned to them. I then converted the record into an Avro GenericRecord via AvroTypeUtil.createAvroRecord(record, avroSchema) in a loop of 1,000,000 iterations and output how long the loop took. This was then repeated 1,000 times to allow the JVM to warm up.
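
      The shape of such a baseline test can be sketched roughly as below. This is only an illustration of the measurement loop, not the unit test from this ticket: the class ConversionBenchmark and its methods are hypothetical, and convert(...) is a plain-Java stand-in for AvroTypeUtil.createAvroRecord(record, avroSchema) so that the sketch runs without the NiFi and Avro jars. The round and iteration counts in main are scaled down from the 1,000 x 1,000,000 described above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ConversionBenchmark {

    // Accumulates a value from every conversion so the JIT cannot discard the work.
    static long sink;

    /** Builds a record with fieldCount String fields; every other field holds null. */
    static Map<String, Object> buildRecord(int fieldCount) {
        Map<String, Object> values = new HashMap<>();
        for (int i = 0; i < fieldCount; i++) {
            values.put("field" + i, (i % 2 == 0) ? null : "value" + i);
        }
        return values;
    }

    /** Field names in schema order, matching the keys used by buildRecord. */
    static List<String> fieldNames(int fieldCount) {
        List<String> names = new ArrayList<>(fieldCount);
        for (int i = 0; i < fieldCount; i++) {
            names.add("field" + i);
        }
        return names;
    }

    /** Hypothetical stand-in for AvroTypeUtil.createAvroRecord(record, avroSchema). */
    static Object[] convert(Map<String, Object> record, List<String> names) {
        Object[] out = new Object[names.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = record.get(names.get(i));
        }
        return out;
    }

    /** Times `rounds` rounds of `iterations` conversions; returns milliseconds per round. */
    static long[] run(int rounds, int iterations) {
        Map<String, Object> record = buildRecord(100);
        List<String> names = fieldNames(100);
        long[] millis = new long[rounds];
        for (int r = 0; r < rounds; r++) {
            long start = System.nanoTime();
            for (int i = 0; i < iterations; i++) {
                sink += convert(record, names).length;
            }
            millis[r] = (System.nanoTime() - start) / 1_000_000;
        }
        return millis;
    }

    public static void main(String[] args) {
        // Early rounds include JIT warm-up; later rounds are the ones worth comparing.
        long[] millis = run(20, 100_000);
        for (int r = 0; r < millis.length; r++) {
            System.out.println("round " + r + ": " + millis[r] + " ms");
        }
    }
}
```

      Printing every round, rather than only an average, makes it easy to see where the timings level off once the JVM has warmed up. For anything beyond a rough before/after comparison, a harness such as JMH would likely give more trustworthy numbers.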

      Numbers on my MacBook Pro showed that, after the first few iterations, converting 1 million records took on the order of 4.5 seconds.

      After updating the code, the same conversion of 1 million records takes just under 2 seconds, so somewhere on the order of 2x better performance.

              People

              • Assignee: Mark Payne (markap14)
              • Reporter: Mark Payne (markap14)

                  Time Tracking

                  • Original Estimate: Not Specified
                  • Remaining Estimate: 0h
                  • Time Spent: 0.5h