- Type: Epic
- Status: Reopened
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 2.2.0
- Fix Version/s: None
- Component/s: PySpark
- Labels: None
This is an umbrella ticket tracking the general effort to improve performance and interoperability between PySpark and Pandas. The core idea is to use Apache Arrow as the serialization format to reduce the overhead of moving data between PySpark and Pandas.
- incorporates
  - SPARK-21187 Complete support for remaining Spark data types in Arrow Converters (Resolved)
-
SPARK-22216