Apache NiFi / NIFI-9691

Create processors for joining record-oriented data with enrichment data



    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.16.0
    • Component/s: Extensions
    • Labels: None


      A powerful capability of NiFi is its ability to perform enrichment on data as it flows through the system. We have processors such as LookupRecord and GeoEnrichRecord. However, there are cases where these don't really provide the necessary capabilities for performing enrichment.

      A particularly powerful use case is when we have data that we want to enrich by calling out to some web service. In this case, we don't want to send our payload to the web service. Instead, we want to transform our payload into a request that is reasonable to send to a web service. Then, we want to take the result from that web service call and use it to enrich our original payload. NiFi does not currently offer a convenient mechanism for doing this.

      We should add two additional processors: ForkEnrichment and JoinEnrichment.

      ForkEnrichment would be used to create a clone of the incoming FlowFile, assigning relevant attributes to the original and the clone and then sending each to a different relationship (original and enrichment).
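
      The fork step described above can be sketched roughly as follows. This is a toy model in Python, not NiFi code: the FlowFile class, the fork_enrichment function, and the attribute names "enrichment.group.id" and "enrichment.role" are assumptions made for illustration, since the issue only says that "relevant attributes" would be assigned.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class FlowFile:
    """Toy stand-in for a NiFi FlowFile: a content payload plus string attributes."""
    content: bytes
    attributes: dict = field(default_factory=dict)


def fork_enrichment(flowfile):
    """Return (original, enrichment) copies that share a correlation id.

    The attribute names below are hypothetical; they only illustrate how a
    later join step could recognise the two FlowFiles as a pair.
    """
    group_id = str(uuid.uuid4())  # assumed correlation key shared by both copies
    original = FlowFile(
        flowfile.content,
        {**flowfile.attributes, "enrichment.group.id": group_id, "enrichment.role": "ORIGINAL"},
    )
    enrichment = FlowFile(
        flowfile.content,
        {**flowfile.attributes, "enrichment.group.id": group_id, "enrichment.role": "ENRICHMENT"},
    )
    # In a flow, 'original' would go to the original relationship and
    # 'enrichment' to the enrichment relationship.
    return original, enrichment


incoming = FlowFile(b'[{"id": 1}]', {"filename": "users.json"})
original, enrichment = fork_enrichment(incoming)
```

      Both copies start with identical content; only the copy on the enrichment path would be transformed afterwards, while the shared id lets the two be matched up again.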

      Data sent to the 'enrichment' connection can then be transformed into whatever is necessary to send as a request to the web service. The result would then be fed to the JoinEnrichment processor.

      JoinEnrichment should then take input from the "original" connection and the "enrichment" connection and join the records together.

      I can foresee three ways to join together the Records:

      • Correlating the records by their index in the FlowFile with a wrapper (i.e., there's a "wrapper" element that encapsulates the first record from the "original" FlowFile and the first record from the "enrichment" FlowFile).
      • Correlating the records by their index in the FlowFile and inserting the enrichment record into the original payload.
      • Using SQL with a JOIN clause to join the records based on some field within the data.
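
      The three join approaches listed above might look roughly like this. It is a minimal Python sketch over lists of dicts, not the processor's implementation: the function names, the sample records, the "original"/"enrichment" wrapper keys, the field_name parameter, and the use of SQLite as the SQL engine are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical sample records standing in for the two FlowFiles' contents.
original = [{"id": 1, "name": "John"}, {"id": 2, "name": "Jane"}]
enrichment = [{"id": 1, "city": "Paris"}, {"id": 2, "city": "Oslo"}]


def join_wrapper(orig, enrich):
    """Pair records by index and wrap each pair in a new element."""
    # zip stops at the shorter input, so unmatched trailing records are dropped.
    return [{"original": o, "enrichment": e} for o, e in zip(orig, enrich)]


def join_insert(orig, enrich, field_name="enrichment"):
    """Pair records by index and insert the enrichment record into the original."""
    return [{**o, field_name: e} for o, e in zip(orig, enrich)]


def join_sql(orig, enrich):
    """Join on the 'id' field with a SQL JOIN, independent of record order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE original (id INTEGER, name TEXT)")
    conn.execute("CREATE TABLE enrichment (id INTEGER, city TEXT)")
    conn.executemany("INSERT INTO original VALUES (:id, :name)", orig)
    conn.executemany("INSERT INTO enrichment VALUES (:id, :city)", enrich)
    rows = conn.execute(
        "SELECT o.id, o.name, e.city FROM original o "
        "JOIN enrichment e ON o.id = e.id ORDER BY o.id"
    ).fetchall()
    conn.close()
    return [{"id": r[0], "name": r[1], "city": r[2]} for r in rows]


wrapped = join_wrapper(original, enrichment)
inserted = join_insert(original, enrichment)
# Reversing the enrichment records shows that the SQL join does not depend on index.
joined = join_sql(original, enrichment[::-1])
```

      The two index-based approaches rely on the enrichment records arriving in the same order as the originals, whereas the SQL approach matches on a field value and so tolerates reordering.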






              Assignee: Mark Payne (markap14)
              Reporter: Mark Payne (markap14)
              Votes: 0
              Watchers: 3



                Time Tracking

                  Original Estimate - Not Specified
                  Remaining Estimate - 0h
                  Time Spent - 1h 10m