[SPARK-1379] Calling .cache() on a SchemaRDD should do something more efficient than caching the individual row objects. - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: None
Component/s: SQL
Labels: None
Target Version/s: 1.2.0

Description

Since rows aren't black boxes (their schema is known), we could cache them using InMemoryColumnarTableScan instead of caching individual row objects. This would significantly reduce GC pressure on the workers.
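To illustrate the idea, here is a minimal sketch (in plain Python, not Spark code) of why a columnar cache creates fewer long-lived objects than a cache of row objects. The two-column schema, the `to_columnar` and `scan` helpers, and the sample data are all hypothetical; this is not how InMemoryColumnarTableScan is implemented, only the general technique of packing each column of a batch into one typed buffer.

```python
from array import array

# Hypothetical schema: (id: int, score: float). Illustration only,
# not Spark's actual classes or storage layout.
rows = [(i, i * 0.5) for i in range(1000)]


def to_columnar(rows):
    """Pack a batch of (int, float) rows into one typed buffer per column."""
    ids = array("q")     # signed 64-bit integers, stored unboxed
    scores = array("d")  # 64-bit floats, stored unboxed
    for row_id, score in rows:
        ids.append(row_id)
        scores.append(score)
    return {"id": ids, "score": scores}


def scan(columns):
    """Rebuild rows lazily from the column buffers, as a table scan would."""
    return zip(columns["id"], columns["score"])


batch = to_columnar(rows)
restored = list(scan(batch))
```

With row caching, the cache holds one object per row plus one per field; with the columnar layout, it holds one buffer per column regardless of the row count, so a garbage collector has far fewer objects to trace.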

Issue Links

is related to: SPARK-3212 Improve the clarity of caching semantics (Resolved)

People

Assignee: Michael Armbrust
Reporter: Michael Armbrust
Votes: 0
Watchers: 8

Dates

Created: 01/Apr/14 02:30
Updated: 03/Oct/14 19:35
Resolved: 03/Oct/14 19:35
