Details
- Type: New Feature
- Status: Resolved
- Priority: P2
- Resolution: Duplicate
- 2.9.0
Description
Hello, everyone,
It would be great to have the Spark runner available to the Python SDK.
While we are happy running Apache Beam on Dataflow, a few of our use cases require different dependencies and a different OS environment, which makes a self-managed Spark cluster a more appropriate place to run them. A Spark runner for the Python SDK would let us define all of our data pipelines in a single language.
I would like to hear the community's feedback on this feature.
Issue Links
- is duplicated by BEAM-2891 Spark runs portable pipelines (Resolved)