- Type: New Feature
- Status: Resolved
- Priority: Major
- Resolution: Duplicate
- Affects Version/s: 1.6.0, 2.0.0
- Fix Version/s: None
- Component/s: Spark Core, SQL
- Labels: None
Idea.
Apache Arrow (http://arrow.apache.org/) is an open-source implementation of an in-memory columnar store. It has APIs in many programming languages.
We could consider using it in Apache Spark to avoid data (de-)serialization
when running PySpark (and R) UDFs.
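To make the motivation concrete, here is a minimal sketch of the difference between per-row serialization and a shared columnar buffer. It uses only the Python standard library, not Arrow or Spark, so it is an analogy under stated assumptions rather than how either project works internally; the names `rows`, `ids_view`, and `vals_view` are hypothetical.

```python
import pickle
from array import array

# Hypothetical sample data: (id, value) records.
rows = [(i, i * 0.5) for i in range(5)]

# Row-oriented exchange: every record is serialized and deserialized
# on its own, so the cost scales with the number of rows.
row_payloads = [pickle.dumps(r) for r in rows]
decoded_rows = [pickle.loads(p) for p in row_payloads]

# Column-oriented exchange: each column is one contiguous typed buffer.
ids = array("q", (r[0] for r in rows))    # 64-bit signed integers
vals = array("d", (r[1] for r in rows))   # 64-bit floats
id_bytes = ids.tobytes()
val_bytes = vals.tobytes()

# The consumer reinterprets the raw bytes as typed values without
# decoding each value separately (assumes the same machine layout
# on both sides).
ids_view = memoryview(id_bytes).cast("q")
vals_view = memoryview(val_bytes).cast("d")

print(ids_view.tolist())
print(vals_view.tolist())
```

Arrow defines a language-independent layout for such buffers, which is what would let the JVM and a Python or R worker read the same columns.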
- duplicates: SPARK-13534 Implement Apache Arrow serializer for Spark DataFrame for use in DataFrame.toPandas (Resolved)