Beam / BEAM-6296

Support Python Spark Runner

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: P2
    • Resolution: Duplicate
    • Affects Version/s: 2.9.0
    • Fix Version/s: Not applicable
    • Component/s: runner-spark

    Description

      Hello, everyone,

      It would be great to have a Spark runner available for the Python SDK.

      While we are happy running Apache Beam on Dataflow, a few of our use cases require different dependencies and OS environments, which makes a self-managed Spark cluster a better fit. With a Spark runner for the Python SDK, we could define all of our data pipelines in a single language.

      We would like to hear the community's feedback on this feature.

            People

              Assignee: Amit Sela (amitsela)
              Reporter: Lei (Eddy) Xu (eddyxu)
              Votes: 0
              Watchers: 2

              Dates

                Created:
                Updated:
                Resolved: