Apache NiFi / NIFI-4943

Batch Duration capability from ExecuteProcess added to ExecuteStreamCommand

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.5.0
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

    Description

      It would be great if the ExecuteStreamCommand processor could emit one FlowFile per chunk of stdout, split on a configurable separator (the common case: one FlowFile per line of stdout).

      My use case is running a third-party CLI (Linux) with the following behaviour:

      • Must be triggered by an incoming FlowFile whose attributes or content carry the parameters for the CLI
      • Accepts parameters via flags or environment variables
      • Writes its output to stdout as a stream of JSON objects
      • The output can be huge (many millions of objects), so buffering all of stdout is not an option: each line/object should be sent as a separate FlowFile
      • Errors and log messages are written to stderr (which can be very chatty)

      Using ExecuteProcess is not an option (it cannot be triggered by an incoming FlowFile), but the way it treats stdout is exactly what is desired.
      Using ExecuteStreamCommand is not an option either, as it buffers the output until the binary exits with status code 0.
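
      The requested behaviour can be illustrated outside NiFi with a minimal plain-Java sketch. It is not NiFi code and uses no NiFi API: the class name StreamingCommandSketch, the method runAndStream, and the onLine callback are hypothetical names chosen for illustration, and the callback only stands in for "create and transfer one FlowFile". The sketch assumes a newline separator, a command that reads nothing from stdin, and a Linux host with sh available for the demo.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

public class StreamingCommandSketch {

    /**
     * Runs a command and hands each stdout line to onLine as soon as it is read,
     * instead of buffering the whole output until the process exits.
     * Hypothetical helper for illustration only; not part of NiFi.
     */
    public static int runAndStream(List<String> command,
                                   Map<String, String> extraEnv,
                                   Consumer<String> onLine)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Parameters may be passed as flags (in command) or as environment variables.
        builder.environment().putAll(extraEnv);
        // Send the chatty stderr to the parent's stderr so it neither mixes
        // with stdout nor blocks the child on a full pipe.
        builder.redirectError(ProcessBuilder.Redirect.INHERIT);

        Process process = builder.start();
        // Assumption: the command needs no stdin, so close it immediately.
        process.getOutputStream().close();

        try (BufferedReader stdout = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = stdout.readLine()) != null) {
                // In a processor, this is where one FlowFile per line would be emitted.
                onLine.accept(line);
            }
        }
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // Demo command: two JSON objects on stdout, one log line on stderr.
        List<String> command = Arrays.asList(
                "sh", "-c", "printf '{\"id\":1}\\n{\"id\":2}\\n'; echo 'log noise' >&2");
        int exitCode = runAndStream(command, Collections.<String, String>emptyMap(),
                line -> System.out.println("would emit FlowFile with content: " + line));
        System.out.println("exit code: " + exitCode);
    }
}
```

      Only one line is held in memory at a time, which is the property the use case needs when the output runs to millions of objects. How a real processor would batch or commit the resulting FlowFiles is left open here.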

       

          People

            Assignee: Unassigned
            Reporter: Oleksandr Lobunets
            Votes: 0
            Watchers: 1
