[SPARK-3389] Add converter class to make reading Parquet files easy with PySpark - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 1.2.0
Component/s: PySpark
Labels: None
Target Version/s: 1.2.0

Description

If a user wants to read Parquet data from PySpark, they currently must use SparkContext.newAPIHadoopFile. If they do not provide a valueConverter, each record comes back as a JSON string that must be parsed. Here I add a Converter implementation based on the one in the AvroConverters.scala file.
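
A minimal sketch of the two read paths described above, for illustration only. The helper name read_parquet_records is hypothetical, and the Java class names (input format, key and value classes, converter) are assumptions modeled on Spark's bundled Python converter examples; they may differ by Spark and Parquet version, and the corresponding jars would need to be on the classpath. The only PySpark call used is SparkContext.newAPIHadoopFile with its valueConverter keyword.

```python
import json

# Assumed class names, modeled on Spark's example converters; adjust to
# match the Parquet/Avro jars actually available on the classpath.
INPUT_FORMAT = "parquet.avro.AvroParquetInputFormat"
KEY_CLASS = "java.lang.Void"
VALUE_CLASS = "org.apache.avro.generic.IndexedRecord"
VALUE_CONVERTER = (
    "org.apache.spark.examples.pythonconverters.IndexedRecordToJavaConverter"
)


def read_parquet_records(sc, path, use_converter=True):
    """Return an RDD of Python dicts read from a Parquet file (hypothetical helper).

    With use_converter=True the JVM-side converter is asked to turn each
    record into Python-friendly objects. With use_converter=False each
    value is expected to arrive as a JSON string and is parsed here.
    """
    kwargs = {}
    if use_converter:
        kwargs["valueConverter"] = VALUE_CONVERTER

    rdd = sc.newAPIHadoopFile(path, INPUT_FORMAT, KEY_CLASS, VALUE_CLASS, **kwargs)

    # Keys carry no data for this input format, so keep only the values.
    values = rdd.map(lambda kv: kv[1])
    if not use_converter:
        values = values.map(json.loads)
    return values
```

With a live SparkContext named sc, this would be called as read_parquet_records(sc, "/path/to/file.parquet"). Passing use_converter=False shows the fallback the description mentions, where the JSON strings are parsed on the Python side.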

Issue Links

links to

[Github] Pull Request #2256 (laserson)

People

Assignee: Uri Laserson

Reporter: Uri Laserson

Votes: 0

Watchers: 2

Dates

Created: 03/Sep/14 23:02

Updated: 28/Sep/14 04:56

Resolved: 28/Sep/14 04:56