[SPARK-39301] Leverage LocalRelation in createDataFrame with Arrow optimization - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 3.4.0
Fix Version/s: 3.4.0
Component/s: PySpark
Labels: None

Description

Currently, we use LogicalRDD, which always creates an RDD. Spark SQL has some nice optimizations for LocalRelation. We should leverage them in createDataFrame in PySpark with Arrow optimization to speed it up.
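
For context, the code path in question is the one taken when a pandas DataFrame is passed to createDataFrame with Arrow enabled. The following is a minimal sketch, not the change made in the pull request: the helper names `create_df_with_arrow` and `demo` are hypothetical, and it assumes `pyspark`, `pandas`, and `pyarrow` are installed. Printing the plan is one way to see which relation backs the DataFrame; the exact plan text depends on the Spark version, so no expected output is shown.

```python
# Config key that turns on Arrow-based conversion in PySpark.
ARROW_CONF_KEY = "spark.sql.execution.arrow.pyspark.enabled"


def create_df_with_arrow(spark, pdf):
    """Enable Arrow optimization on the session, then convert `pdf`.

    Hypothetical helper for illustration; `spark` is a SparkSession and
    `pdf` is a pandas DataFrame.
    """
    spark.conf.set(ARROW_CONF_KEY, "true")
    return spark.createDataFrame(pdf)


def demo():
    """Build a tiny DataFrame and print its query plans."""
    # Imported lazily so the helper above can be imported without Spark.
    import pandas as pd
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[1]")
        .appName("spark-39301-sketch")
        .getOrCreate()
    )
    try:
        pdf = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})
        df = create_df_with_arrow(spark, pdf)
        # Extended explain prints the logical and physical plans, which
        # show the relation the DataFrame was created from.
        df.explain(True)
        return df.count()
    finally:
        spark.stop()
```

Calling `demo()` in an environment with Spark available prints the plans for a three-row DataFrame, which can be compared across Spark versions before and after the fix.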

Issue Links

links to

[Github] Pull Request #36683 (HyukjinKwon)

People

Assignee: Hyukjin Kwon

Reporter: Hyukjin Kwon

Votes: 0

Watchers: 1

Dates

Created: 26/May/22 05:08

Updated: 12/Dec/22 18:10

Resolved: 13/Jun/22 11:10