Details
- Umbrella
- Status: Resolved
- Major
- Resolution: Done
- 3.4.0
- None
- None
Description
User-defined Functions in Python consist of (pickled) Python UDFs and (Arrow-optimized) Pandas UDFs. They enable users to run arbitrary Python code on top of the Apache Spark™ engine. Users only have to state "what to do"; PySpark, as a sandbox, encapsulates "how to do it".
The Spark Connect Python Client (SCPC), a client and server interface for PySpark, will eventually replace the legacy PySpark API. Supporting PySpark UDFs is essential for Spark Connect to reach parity with that legacy API.
See design doc here.
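For illustration, here is a minimal sketch of the two UDF flavors named above: a row-at-a-time (pickled) Python UDF and a batch-oriented Pandas UDF. The function names `shout`, `plus_one` and `run`, the sample data, and the optional `remote_url` parameter are assumptions made for this example and are not taken from the issue or the design doc. `SparkSession.builder.remote(...)` is the Spark Connect entry point available from PySpark 3.4; whether a given UDF runs over Spark Connect depends on the server version.

```python
def shout(text):
    """Row-at-a-time logic: called once per value when wrapped as a Python UDF."""
    return None if text is None else text.upper()


def plus_one(batch):
    """Batch logic: receives a whole pandas.Series when wrapped as a Pandas UDF."""
    return batch + 1


def run(remote_url=None):
    """Apply both UDFs to a tiny DataFrame (hypothetical helper for this sketch).

    Pass a Spark Connect URL such as "sc://localhost:15002" (an assumed address)
    to go through Spark Connect; leave it as None for a classic local session.
    """
    # Local imports keep the plain functions above usable without Spark installed.
    import pandas as pd
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import pandas_udf, udf

    builder = SparkSession.builder
    if remote_url is not None:
        builder = builder.remote(remote_url)
    spark = builder.getOrCreate()

    # Pickled Python UDF: the function is serialized and invoked per row.
    shout_udf = udf(shout, returnType="string")

    # Pandas UDF: data is exchanged via Arrow and processed one batch at a time.
    @pandas_udf("long")
    def plus_one_udf(batch: pd.Series) -> pd.Series:
        return plus_one(batch)

    df = spark.createDataFrame([(1, "a"), (2, "b")], ["n", "s"])
    result = df.select(
        shout_udf("s").alias("s_upper"),
        plus_one_udf("n").alias("n_plus_one"),
    )
    rows = [tuple(row) for row in result.collect()]
    spark.stop()
    return rows
```

Calling `run()` uses a regular local session, while `run("sc://localhost:15002")` would route the same code through a Spark Connect server at that (assumed) address. The user code is identical in both cases, which is the parity this umbrella issue aims for.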
Issue Links
- is depended upon by
  - SPARK-42393 Support for Pandas/Arrow Functions API (Resolved)
- is related to
  - SPARK-42271 Reuse UDF test cases under `pyspark.sql.tests` (Resolved)