Description
spark.sql.execution.arrow.enabled was added when we added the PySpark Arrow optimization.
Later, in the current master, the SparkR Arrow optimization was added, and it is controlled by the same configuration, spark.sql.execution.arrow.enabled.
There appear to be two issues with this:
1. spark.sql.execution.arrow.enabled in PySpark was added in 2.3.0, whereas the SparkR optimization was added in 3.0.0. Their maturity differs, so it is problematic if we want to change the default value for only one of the two optimizations.
2. Suppose users share one JVM between PySpark and SparkR. They are currently forced to enable the optimization for both or neither, as illustrated in the sketch below.
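A minimal sketch, assuming Spark 2.3+ with PyArrow installed, of how the shared flag is toggled from PySpark today. Because SparkR reads the same configuration key, the setting cannot differ between the two language frontends when they share one JVM:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("arrow-flag-demo").getOrCreate()

# Enabling the optimization here also enables it for any SparkR session
# sharing this JVM, since both frontends read the same configuration key.
spark.conf.set("spark.sql.execution.arrow.enabled", "true")

df = spark.range(1000).selectExpr("id", "id * 2 AS doubled")
pdf = df.toPandas()  # Arrow-based conversion is used when the flag is enabled
print(pdf.head())

spark.stop()
```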
Issue Links
- relates to SPARK-21187 Complete support for remaining Spark data types in Arrow Converters (Resolved)
- relates to SPARK-26759 Arrow optimization in SparkR's interoperability (Resolved)