[SYSTEMDS-1390] Avoid unnecessary caching of parfor spark datapartition-execute input - ASF JIRA

Details

Type: Sub-task
Status: Closed
Priority: Major
Resolution: Done
Affects Version/s: None
Fix Version/s: SystemML 0.14
Component/s: APIs, Runtime
Labels: None

Description

This task avoids unnecessary input caching for parfor Spark datapartition-execute jobs (with grouping) in order to reduce memory pressure, and thus garbage collection overhead, during the shuffle and subsequent execution. The change applies only to the general case with grouping, and only if the input is a persisted RDD that has not been cached yet.
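The applicability condition above can be read as a simple predicate. The following is a minimal sketch only: `ParforInput`, its three flags, and `skip_input_caching` are hypothetical names introduced for illustration and are not part of the SystemML code base; how each flag would actually be derived from the runtime is left open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParforInput:
    """Hypothetical summary of a parfor input (not a SystemML class)."""
    requires_grouping: bool  # general datapartition-execute case with grouping
    is_persisted_rdd: bool   # input is a persisted RDD
    is_cached: bool          # input has already been cached


def skip_input_caching(inp: ParforInput) -> bool:
    """Return True if input caching would be skipped under the stated rule.

    All three conditions from the description must hold: grouping is
    required, the input is a persisted RDD, and it is not cached yet.
    """
    return inp.requires_grouping and inp.is_persisted_rdd and not inp.is_cached


if __name__ == "__main__":
    candidate = ParforInput(requires_grouping=True,
                            is_persisted_rdd=True,
                            is_cached=False)
    print(skip_input_caching(candidate))
```

Under this reading, an input that is already cached is left alone, since skipping the cache step would save nothing there.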

People

Assignee: Matthias Boehm

Reporter: Matthias Boehm

Votes: 0

Watchers: 1

Dates

Created: 10/Mar/17 09:00

Updated: 21/Apr/17 05:05

Resolved: 10/Mar/17 20:16