Apache NiFi / NIFI-9691

Create processors for joining record-oriented data with enrichment data


Details

    • New Feature
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • 1.16.0
    • Extensions
    • None

    Description

      A powerful capability of NiFi is its ability to perform enrichment on data as it flows through the system. We have processors such as LookupRecord and GeoEnrichRecord. However, there are cases where these don't really provide the necessary capabilities for performing enrichment.

      A particularly powerful use case is when we have data that we want to enrich by calling out to some web service. In this case, we don't want to send our payload to the web service. Instead, we want to transform our payload into a request that is reasonable to send to a web service. Then, we want to take the result from that web service call and use it to enrich our original payload. NiFi does not currently offer a convenient mechanism for doing this.

      We should add two additional processors: ForkEnrichment and JoinEnrichment.

      ForkEnrichment would be used to create a clone of the incoming FlowFile, assign relevant attributes to the original and the clone, and then send each to a different relationship (original and enrichment).
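
      The fork step described above can be sketched roughly as follows. This is a minimal illustration in Python, not NiFi code: the FlowFile class, the fork_enrichment function, and the attribute names "enrichment.group.id" and "enrichment.role" are all assumptions made for the example, since this issue does not say which attributes are assigned.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class FlowFile:
    """Hypothetical stand-in for a NiFi FlowFile: content plus attributes."""
    content: bytes
    attributes: dict = field(default_factory=dict)


def fork_enrichment(flowfile):
    """Return (original, enrichment) copies that share a correlation id.

    The attribute names are assumptions for illustration only.
    """
    group_id = str(uuid.uuid4())  # lets a later join step pair the two copies
    original = FlowFile(
        flowfile.content,
        {**flowfile.attributes,
         "enrichment.group.id": group_id,
         "enrichment.role": "ORIGINAL"},
    )
    enrichment = FlowFile(
        flowfile.content,
        {**flowfile.attributes,
         "enrichment.group.id": group_id,
         "enrichment.role": "ENRICHMENT"},
    )
    return original, enrichment


incoming = FlowFile(b'[{"id": 1}]', {"filename": "records.json"})
original, enrichment = fork_enrichment(incoming)
```

      The shared group id is what would allow a downstream join to recognize which "original" and "enrichment" FlowFiles belong together, even after the enrichment copy has been transformed.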

      Data sent to the "enrichment" connection can then be transformed into whatever is necessary to send as a request to the web service. The result would then be fed to the JoinEnrichment processor.

      JoinEnrichment should then take input from the "original" connection and the "enrichment" connection and join the records together.

      I can foresee three ways to join together the Records:

      • Correlating the records by their index in the FlowFile with a wrapper (i.e., there's a "wrapper" element that encapsulates the first record from the "original" FlowFile and the first record from the "enrichment" FlowFile).
      • Correlating the records by their index in the FlowFile and inserting the enrichment record into the original payload.
      • Using SQL with a JOIN clause to join the records based on some field within the data.
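
      The three approaches listed above can be compared with a small sketch. It is written in Python with records as plain dicts, purely for illustration. The function names, the "enrichment" field name, the sample fields (id, name, city), and the use of an in-memory SQLite database for the SQL case are all assumptions, not part of this proposal.

```python
import sqlite3


def join_wrapper(originals, enrichments):
    """Strategy 1: pair records by index inside a wrapper element."""
    # zip() stops at the shorter input; a real processor would need a policy
    # for FlowFiles whose record counts differ.
    return [{"original": o, "enrichment": e}
            for o, e in zip(originals, enrichments)]


def join_insert(originals, enrichments, field_name="enrichment"):
    """Strategy 2: pair by index, inserting the enrichment into the original."""
    return [{**o, field_name: e} for o, e in zip(originals, enrichments)]


def join_sql(originals, enrichments):
    """Strategy 3: join on a field in the data using SQL (here, 'id')."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE original (id INTEGER, name TEXT)")
    conn.execute("CREATE TABLE enrichment (id INTEGER, city TEXT)")
    conn.executemany("INSERT INTO original VALUES (:id, :name)", originals)
    conn.executemany("INSERT INTO enrichment VALUES (:id, :city)", enrichments)
    rows = conn.execute(
        "SELECT o.id, o.name, e.city "
        "FROM original o LEFT OUTER JOIN enrichment e ON o.id = e.id "
        "ORDER BY o.id"
    ).fetchall()
    conn.close()
    return [{"id": r[0], "name": r[1], "city": r[2]} for r in rows]


originals = [{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}]
# The enrichment records arrive in a different order than the originals.
enrichments = [{"id": 2, "city": "Oslo"}, {"id": 1, "city": "Lima"}]

wrapped = join_wrapper(originals, enrichments)
inserted = join_insert(originals, enrichments)
joined = join_sql(originals, enrichments)
```

      In this sample the enrichment records are deliberately out of order, so the two index-based strategies pair "alice" with the record for id 2, while the SQL join matches on id. Index-based joining is only appropriate when the web service returns results in the same order as the request.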

       

            People

              Mark Payne (markap14)


                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 1h 10m