  1. Apache NiFi
  2. NIFI-11240

Introduce Python API for building Processors

Details

    • To Do

    Description

      The scripting processors are very commonly used for data transformation in NiFi. In particular, the Jython-based scripts are quite heavily used. However, Jython runs on the JVM and does not support libraries that depend on CPython. As a result, scripts are syntax-compatible with Python but cannot make use of the wealth of Python libraries. And that wealth of libraries is what makes Python popular to begin with.

      Additionally, heavy use of script-based processors hurts the UX. They are cumbersome to configure, requiring script files and/or script bodies. They result in a dataflow that's difficult to understand because, instead of a descriptively named processor type like CompressContent, the type and default name are "ExecuteScript". They're also difficult to share.

      I have been experimenting with Py4J to introduce a true Python-based API for developing Processors. This will introduce new APIs, new framework changes, and documentation, and it will likely take a while to stabilize. However, the sooner we are able to get it into the hands of users, the better. Therefore, I propose that we introduce it in multiple milestones. We can create sub-tickets for different milestones, but in general it should follow:

      Milestone 1: Initial implementation. Provides the capability and an API for building processors. Includes sample code and some documentation. Includes tests to ensure proper operation. Should not be used in production. API will not be stable and may change frequently. Performance may be subpar. Get into the hands of developers to begin exploring and providing feedback / submitting PRs.

      Milestone 2: Bug fixes. API refinement. Improve performance.

      Milestone 3: Additional bug fixes and API refinement. API should become more stable.

      Milestone 4: Additional bug fixes. API becomes stable. Documentation is clear and sufficient. Recommend production use.
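
      To make the goal more concrete, the sketch below shows roughly what a processor written as a plain Python class might look like. It is only an illustration: every name in it (TransformResult, UppercaseText, the transform signature, the "text.case" attribute) is hypothetical and is not taken from NiFi, Py4J, or any API proposed in this ticket. It runs standalone in CPython with no NiFi dependency.

```python
# Hypothetical sketch only: none of these names come from NiFi or Py4J.
from dataclasses import dataclass, field


@dataclass
class TransformResult:
    """Hypothetical result: target relationship plus new content and attributes."""
    relationship: str
    contents: bytes
    attributes: dict = field(default_factory=dict)


class UppercaseText:
    """A processor with its own descriptive type name, unlike a generic ExecuteScript."""

    description = "Converts FlowFile text content to upper case"

    def transform(self, contents: bytes, attributes: dict) -> TransformResult:
        try:
            text = contents.decode("utf-8")
        except UnicodeDecodeError:
            # Content is not valid UTF-8: route it to "failure" unchanged.
            return TransformResult("failure", contents, dict(attributes))
        # Copy the incoming attributes and record what was done.
        new_attributes = {**attributes, "text.case": "upper"}
        return TransformResult("success", text.upper().encode("utf-8"), new_attributes)


# Stand-in for what the framework would do when a FlowFile arrives.
result = UppercaseText().transform(b"hello nifi", {"filename": "greeting.txt"})
print(result.relationship, result.contents)
```

      The point of the sketch is that the class name itself (UppercaseText) can serve as the processor type shown on the canvas, and that the body is free to import any CPython library, which addresses the two complaints about script-based processors above.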

       

       

          People

            markap14 Mark Payne

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 9h