Apache NiFi / NIFI-14020

Add Record and Demarcator support to ConsumeGCPubSub


Details

    • Type: Improvement
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Extensions
    • Labels: None

    Description

      Currently, ConsumeGCPubSub generates one FlowFile per consumed message. The Batch Size property only sets the maximum number of messages pulled from the subscription in a single API call; it does not group messages into FlowFiles. This can be extremely inefficient.

      As with the Kafka processors, we should add a Processing Strategy option with the following values:

      • Flow File - the current behavior: each message becomes one FlowFile, and FlowFile attributes store the attributes associated with the message along with metadata such as the message ID and ack ID.
      • Demarcator - messages are appended to a single FlowFile, separated by a custom demarcator. In this case, attributes specific to individual messages are lost. This is nevertheless the most efficient strategy when very high throughput is required and the message format allows this approach.
      • Record - a reader and a writer can be specified to process the messages. This is useful for changing the message format on the fly, or when the message format does not allow the Demarcator strategy. In addition, an output strategy is available with two allowable values:
        • Value - all messages are written to the same FlowFile using the specified writer. In this case, attributes specific to individual messages are lost.
        • Wrapper - the writer's schema is overridden to include the metadata of the message as well as a map of its attributes.
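
      The difference between the Demarcator strategy and the Wrapper output can be sketched in plain Java. This is an illustration under stated assumptions, not the processor's implementation: `PulledMessage`, `demarcate`, `wrap`, and the wrapper field names (`messageId`, `ackId`, `attributes`, `value`) are hypothetical and do not come from NiFi or the Google Cloud client library.

```java
import java.io.ByteArrayOutputStream;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: none of these types are NiFi or Google Cloud client classes.
final class ProcessingStrategySketch {

    // Hypothetical stand-in for one message pulled from a subscription.
    record PulledMessage(String messageId, String ackId,
                         Map<String, String> attributes, byte[] payload) {}

    // Demarcator strategy: concatenate all payloads into a single content stream,
    // writing the demarcator between (not after) messages.
    // Per-message attributes are not carried over.
    static byte[] demarcate(List<PulledMessage> batch, byte[] demarcator) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < batch.size(); i++) {
            if (i > 0) {
                out.write(demarcator, 0, demarcator.length);
            }
            byte[] payload = batch.get(i).payload();
            out.write(payload, 0, payload.length);
        }
        return out.toByteArray();
    }

    // Record strategy with Wrapper output: keep the metadata and the attribute map
    // next to the parsed value. The field names here are assumptions.
    static Map<String, Object> wrap(PulledMessage message, Object parsedValue) {
        Map<String, Object> wrapper = new LinkedHashMap<>();
        wrapper.put("messageId", message.messageId());
        wrapper.put("ackId", message.ackId());
        wrapper.put("attributes", message.attributes());
        wrapper.put("value", parsedValue);
        return wrapper;
    }
}
```

      In this sketch, `demarcate` returns only payload bytes, which is why message attributes cannot survive the Demarcator strategy, while `wrap` keeps them at the cost of a larger record.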

            People

              Assignee: Pierre Villard (pvillard)
              Reporter: Pierre Villard (pvillard)
              Votes: 0
              Watchers: 1

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 0.5h