[SPARK-10705] Stop converting internal rows to external rows in DataFrame.toJSON - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 1.3.1, 1.4.1, 1.5.0
Fix Version/s: 1.6.0
Component/s: SQL
Labels: None
Target Version/s: 1.6.0

Description

DataFrame.toJSON uses DataFrame.mapPartitions, which converts internal rows to external rows. We can use queryExecution.toRdd.mapPartitions instead for better performance.

Another issue is that, for UDT values, serialize produces internal types. As a result, toJSON currently has to handle both internal and external types (see here), which is pretty weird.
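
To make the first point concrete, here is a minimal sketch of the two code paths. It is not the code from the pull request. The object name, the helper names, the two-column schema (id: Int, name: String) and the hand-rolled JSON formatting are all hypothetical, chosen only for illustration. It assumes Spark 1.5 or later on the classpath, where queryExecution.toRdd yields InternalRow values, and it uses df.rdd.mapPartitions as a stand-in for DataFrame.mapPartitions, on the assumption that both go through the same external-row RDD in 1.x.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, SQLContext}

object ToJsonSketch {
  // Hypothetical helper: external-row path. df.rdd converts every internal
  // row to an external Row before the closure sees it.
  def viaExternalRows(df: DataFrame): Array[String] =
    df.rdd.mapPartitions { rows =>
      rows.map(r => s"""{"id":${r.getInt(0)},"name":"${r.getString(1)}"}""")
    }.collect()

  // Hypothetical helper: internal-row path. The closure reads InternalRow
  // values directly, so no per-row conversion to Row takes place.
  // Strings are stored as UTF8String in the internal representation.
  def viaInternalRows(df: DataFrame): Array[String] =
    df.queryExecution.toRdd.mapPartitions { rows =>
      // Internal rows may be reused by the iterator, so each one is turned
      // into a String immediately instead of being held on to.
      rows.map(r => s"""{"id":${r.getInt(0)},"name":"${r.getUTF8String(1)}"}""")
    }.collect()

  // Builds a tiny DataFrame on a local context and runs both paths.
  def demo(): (Array[String], Array[String]) = {
    val conf = new SparkConf().setMaster("local[1]").setAppName("toJSON-sketch")
    val sc = new SparkContext(conf)
    try {
      val sqlContext = new SQLContext(sc)
      import sqlContext.implicits._
      val df = Seq((1, "a"), (2, "b")).toDF("id", "name")
      (viaExternalRows(df), viaInternalRows(df))
    } finally {
      sc.stop()
    }
  }

  def main(args: Array[String]): Unit = {
    val (external, internal) = demo()
    external.foreach(println)
    internal.foreach(println)
  }
}
```

Both helpers should produce the same strings for this schema. The difference is only in what the closure receives: the internal path has to use the internal accessors (getUTF8String rather than getString), which is also why mixing internal and external types inside one serializer, as described for UDT values, gets awkward. The string formatting here does no JSON escaping and is not a substitute for a real JSON generator.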

Issue Links

links to

[Github] Pull Request #8865 (viirya)

People

Assignee: L. C. Hsieh
Reporter: Cheng Lian
Votes: 0
Watchers: 3

Dates

Created: 18/Sep/15 19:22
Updated: 24/Sep/15 19:52
Resolved: 24/Sep/15 19:52