[SPARK-12931] Improve bucket read path to only create one single RDD - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: None
Component/s: SQL
Labels: None

Description

Currently, the bucketed read path creates one RDD per bucket, coalesces each to a single partition, and then unions them into a final RDD. We should create a single RDD instead. This requires modifying the data source interface slightly and abstracting the reader logic out so that it is decoupled from the RDD.
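
To illustrate the proposed shape, here is a minimal sketch in plain Python, not Spark code. It assumes each input file is already tagged with its bucket id; the names `plan_partitions`, `read_partition`, and `RowReader` are hypothetical and do not correspond to any Spark API. The idea is that a single dataset gets one partition per bucket, and a reader function that knows nothing about RDDs is applied to every file in that partition.

```python
from typing import Callable, Iterator, List, Tuple

# Hypothetical stand-in for reader logic that is decoupled from any RDD:
# given a file path, it yields that file's rows.
RowReader = Callable[[str], Iterator[str]]


def plan_partitions(files: List[Tuple[int, str]], num_buckets: int) -> List[List[str]]:
    """Group (bucket_id, path) pairs so that partition i holds all files of bucket i."""
    partitions: List[List[str]] = [[] for _ in range(num_buckets)]
    for bucket_id, path in files:
        partitions[bucket_id].append(path)
    return partitions


def read_partition(paths: List[str], read_file: RowReader) -> Iterator[str]:
    """Read every file of one partition in order, using the decoupled reader."""
    for path in paths:
        yield from read_file(path)
```

With this layout, the per-bucket coalesce and the final union are no longer needed, because the bucket-to-partition mapping is fixed when the single dataset is planned. Buckets with no files simply become empty partitions.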

People

Assignee: Michael Armbrust

Reporter: Wenchen Fan

Votes: 0

Watchers: 4

Dates

Created: 20/Jan/16 20:16

Updated: 02/Jun/16 17:23

Resolved: 16/Apr/16 06:44