  1. Apache NiFi
  2. NIFI-4683

Add ability to execute Spark jobs via Livy

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.5.0
    • Component/s: Extensions
    • Labels: None

    Description

      Proposal for a new feature that enables NiFi users to execute Spark jobs. A natural entry point is Apache Livy, which describes itself as a "REST service for Apache Spark". Going through Livy would let NiFi submit Spark jobs without bundling a Spark client itself (and without having to keep up with Spark versions, for example).
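
As a rough illustration of why Livy is a convenient entry point, the sketch below builds the two HTTP requests a client would need: one to create an interactive session and one to run a code statement in it. The `/sessions` and `/sessions/{id}/statements` endpoints and the `kind`, `numExecutors`, `executorMemory` and `code` fields follow Livy's REST API. The host `livy.example.com:8998`, the helper names and the chosen values are assumptions made for illustration, and nothing is sent over the network.

```python
import json
import urllib.request

LIVY_URL = "http://livy.example.com:8998"  # hypothetical Livy endpoint


def build_post(url, payload):
    """Build (but do not send) a JSON POST request."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_session_request(kind="spark", num_executors=2, executor_memory="1g"):
    """Request for POST /sessions, which starts a new interactive session."""
    payload = {
        "kind": kind,
        "numExecutors": num_executors,
        "executorMemory": executor_memory,
    }
    return build_post(f"{LIVY_URL}/sessions", payload)


def run_statement_request(session_id, code):
    """Request for POST /sessions/{id}/statements, which runs code in a session."""
    return build_post(f"{LIVY_URL}/sessions/{session_id}/statements", {"code": code})


session_req = create_session_request()
statement_req = run_statement_request(0, "sc.parallelize(1 to 10).sum()")
# Actually sending a request would be: urllib.request.urlopen(session_req)
```

The client only needs an HTTP library and JSON, which is the point of the proposal: no Spark classes are required on the NiFi side.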

      Some of the components that could be involved include:

      LivySessionController Controller Service (CS) - provides connections to available sessions in Livy

      • Users could request a connection of a given type, or retrieve the same connection by session id if it is still available
      • Properties to configure the Livy session, such as number of executors and memory
      • Property for connection pool size
      • Will interact with Livy to ensure that only idle/available connections are added to the pool and checked back in
      • Key for pool could be based on session id or type
      • Ensures that any user credentials are provided
      • Leverages SSLContext for security
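
The pooling behaviour listed above can be sketched as follows. This is a minimal illustration and not the controller service's actual design: the class `LivySessionPool`, its method names and the per-kind size cap are all hypothetical. Session state is looked up through a caller-supplied function, which in a real client could wrap Livy's `GET /sessions/{id}/state`; here a plain dictionary stands in for the server.

```python
from collections import defaultdict, deque


class LivySessionPool:
    """Toy pool of Livy session ids, keyed by session kind (hypothetical design)."""

    def __init__(self, get_state, max_size=2):
        self._get_state = get_state      # callable: session id -> state string
        self._max_size = max_size        # cap per kind, like a pool-size property
        self._idle = defaultdict(deque)  # kind -> idle session ids

    def check_in(self, kind, session_id):
        """Return a session to the pool only if it is reported as idle."""
        queue = self._idle[kind]
        if self._get_state(session_id) != "idle":
            return False
        if session_id in queue or len(queue) >= self._max_size:
            return False
        queue.append(session_id)
        return True

    def check_out(self, kind, session_id=None):
        """Hand out a specific pooled session, else any idle session of that kind."""
        queue = self._idle[kind]
        if session_id is not None:
            if session_id in queue:
                queue.remove(session_id)
                return session_id
            return None
        while queue:
            candidate = queue.popleft()
            if self._get_state(candidate) == "idle":  # re-check before handing out
                return candidate
        return None


# Stand-in for a state lookup against Livy, so the sketch runs without a server.
states = {1: "idle", 2: "busy", 3: "idle"}
pool = LivySessionPool(states.get)
results = [pool.check_in("spark", sid) for sid in (1, 2, 3)]
```

Keying the pool by kind while still allowing lookup by session id covers both request styles mentioned in the list: "any session of this type" and "the same session I used before".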

      LivyProcessor

      • Obtains Spark JARs/files via properties and/or flow file attribute(s)
      • Obtains connection information from LivySessionController
      • Provides attributes to configure session, maintain session id, attach to session id
      • Potential advanced UI available for testing code (probably a follow-on Jira)
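
One way to picture how the processor could combine its own properties with flow file attributes is the sketch below. The property names (`jars`, `files`) and attribute names (`livy.jars`, `livy.files`, `livy.session.id`) are invented for this example and are not taken from the issue; the function only decides what to submit and whether to attach to an existing session or create a new one.

```python
def split_csv(value):
    """Split a comma-separated string into trimmed, non-empty items."""
    return [item.strip() for item in (value or "").split(",") if item.strip()]


def plan_submission(properties, attributes):
    """Combine processor properties and flow file attributes into a plan.

    Both arguments are plain dicts of strings; all key names are hypothetical.
    """
    plan = {
        # Property values come first, attribute values are appended.
        "jars": split_csv(properties.get("jars")) + split_csv(attributes.get("livy.jars")),
        "files": split_csv(properties.get("files")) + split_csv(attributes.get("livy.files")),
    }
    session_id = attributes.get("livy.session.id")
    if session_id is not None:
        # An incoming session id means: attach to that session.
        plan["action"] = "attach"
        plan["session_id"] = int(session_id)
    else:
        # Otherwise a session would be requested from the controller service.
        plan["action"] = "create"
    return plan


new_plan = plan_submission({"jars": "hdfs:///libs/a.jar"}, {"livy.jars": "hdfs:///libs/b.jar"})
attach_plan = plan_submission({}, {"livy.session.id": "7"})
```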

            People

              mattyb149 Matt Burgess
              Votes: 0
              Watchers: 4
