  1. Apache NiFi
  2. NIFI-7696

MultiQueryRecord Processor


Details

    Description

      Context:

      QueryRecord is a very useful processor: it lets users run advanced queries over a wide range of data formats (CSV, JSON, and others, thanks to the Record API), which greatly increases NiFi's ETL capability.
      I would like to push NiFi further by making the same thing possible across multiple input FlowFiles, so that join-like queries over several FlowFiles become a reality.

       

      Proposal:

      Create a new processor called "MultiQueryRecord", which can be thought of technically as a cross between the QueryRecord and MergeRecord processors. It takes FlowFiles from different sources, waits until all the FlowFiles it expects have arrived, and then executes all the SQL queries provided as dynamic properties.

       

      • Every FlowFile carries an attribute containing the name of the "virtual table" under which it is referenced in the SQL queries.
      • The user configures how many FlowFiles the processor expects, the name of the attribute that holds the table name, and the name of the correlation attribute used to tell apart FlowFiles from different runs.
      • The user also defines all the SQL queries as dynamic properties (as is done today for the QueryRecord processor).
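
The configuration described in the list above could look roughly like the following sketch. The three fixed property names, the attribute names and the query are hypothetical (the issue does not name them). The only behaviour carried over from QueryRecord is that each dynamic property maps a name to an SQL statement.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Illustrative only: models the proposed configuration as a plain property map. */
final class MultiQueryConfigSketch {
    // Hypothetical property names; the issue does not specify them.
    static final String EXPECTED_COUNT = "Expected FlowFile Count";
    static final String TABLE_ATTRIBUTE = "Table Name Attribute";
    static final String CORRELATION_ATTRIBUTE = "Correlation Attribute Name";

    private static final Set<String> FIXED = new HashSet<>(
            Arrays.asList(EXPECTED_COUNT, TABLE_ATTRIBUTE, CORRELATION_ATTRIBUTE));

    /** Returns the dynamic properties only: property name -> SQL text. */
    static Map<String, String> queries(Map<String, String> properties) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : properties.entrySet()) {
            if (!FIXED.contains(entry.getKey())) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }

    /** Example configuration: two expected FlowFiles joined by a single query. */
    static Map<String, String> exampleProperties() {
        Map<String, String> properties = new LinkedHashMap<>();
        properties.put(EXPECTED_COUNT, "2");
        properties.put(TABLE_ATTRIBUTE, "table.name");   // hypothetical attribute name
        properties.put(CORRELATION_ATTRIBUTE, "run.id"); // hypothetical attribute name
        // Dynamic property: ORDERS and CUSTOMERS are the virtual table names
        // that the incoming FlowFiles would carry in their "table.name" attribute.
        properties.put("orders_with_customer",
                "SELECT o.id, c.name FROM ORDERS o JOIN CUSTOMERS c ON o.customer_id = c.id");
        return properties;
    }
}
```

Here `MultiQueryConfigSketch.queries(MultiQueryConfigSketch.exampleProperties())` yields a single entry, the join query stored under `orders_with_customer`.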

       

      The processor will use the same MergeBin concept as in the MergeRecord processor to handle the pending FlowFiles while waiting for all of them to arrive before executing all the defined SQL queries.
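
The waiting behaviour can be pictured with the minimal sketch below. It is not the MergeRecord bin implementation and not the code mentioned in this issue. `PendingFile` and `CorrelationBins` are hypothetical names, and the sketch assumes that a bin is complete once the expected number of distinct table names has arrived for one correlation value. Bin expiry and limits on the number of bins are left out.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical stand-in for a FlowFile: only the two attributes the proposal relies on. */
final class PendingFile {
    final String correlationId; // value of the correlation attribute
    final String tableName;     // value of the table-name attribute

    PendingFile(String correlationId, String tableName) {
        this.correlationId = correlationId;
        this.tableName = tableName;
    }
}

/** Groups pending files per correlation value and releases a group once it is complete. */
final class CorrelationBins {
    private final int expectedCount;
    private final Map<String, Map<String, PendingFile>> bins = new HashMap<>();

    CorrelationBins(int expectedCount) {
        this.expectedCount = expectedCount;
    }

    /**
     * Adds a file to the bin of its correlation value. Returns the complete bin
     * (table name -> file) once expectedCount distinct tables have arrived,
     * otherwise an empty Optional. A second file with the same table name
     * replaces the first one (an assumption of this sketch).
     */
    Optional<Map<String, PendingFile>> offer(PendingFile file) {
        Map<String, PendingFile> bin =
                bins.computeIfAbsent(file.correlationId, key -> new LinkedHashMap<>());
        bin.put(file.tableName, file);
        if (bin.size() < expectedCount) {
            return Optional.empty(); // still waiting for other tables
        }
        bins.remove(file.correlationId); // the bin is complete, stop tracking it
        return Optional.of(bin);
    }
}
```

With `new CorrelationBins(2)`, offering an "orders" file for one run returns an empty result. Offering the matching "customers" file for the same run returns both files, at which point the queries could be executed.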

       

      Implementation:

      I have already implemented this processor and would like to contribute it to this wonderful project. I am finishing the unit tests and will update this issue with my PR if there is interest.


          People

            mahieddine Mahieddine Cherif


              Time Tracking

                Estimated: 168h
                Remaining: 168h
                Logged: Not Specified