  SystemDS / SYSTEMDS-1308 Runtime feature extensions / SYSTEMDS-1390

Avoid unnecessary caching of parfor spark datapartition-execute input

Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: SystemML 0.14
    • Component/s: APIs, Runtime
    • Labels: None

    Description

      This task aims to avoid unnecessary input caching for parfor Spark datapartition-execute jobs (with grouping), in order to reduce memory pressure and thus garbage collection overhead during the shuffle and subsequent execution. The change applies only to the general case with grouping, and only if the input is a persisted RDD that has not been cached yet.
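
      The condition in the description can be read as a simple predicate. The following is a minimal sketch of that reading only, not the SystemDS implementation: the class `ParforInputCaching`, the type `InputState`, and the method `skipInputCaching` are hypothetical names introduced for illustration, and the two boolean flags stand in for whatever runtime checks the actual code performs.

```java
// Hypothetical sketch: none of these names are SystemDS or Spark APIs.
final class ParforInputCaching {

    /** Illustrative stand-in for the two input properties named in the issue. */
    static final class InputState {
        final boolean persistedRdd;   // assumption: input is an RDD marked for persistence
        final boolean alreadyCached;  // assumption: input has already been materialized in the cache

        InputState(boolean persistedRdd, boolean alreadyCached) {
            this.persistedRdd = persistedRdd;
            this.alreadyCached = alreadyCached;
        }
    }

    /**
     * Returns true if input caching should be skipped for a
     * datapartition-execute job: only for the general case with grouping,
     * and only for a persisted RDD that has not been cached yet.
     */
    static boolean skipInputCaching(boolean withGrouping, InputState in) {
        return withGrouping && in.persistedRdd && !in.alreadyCached;
    }

    public static void main(String[] args) {
        // Grouping, persisted but not yet cached: caching is skipped.
        System.out.println(skipInputCaching(true, new InputState(true, false)));
        // Already cached: nothing to avoid, so the rule does not apply.
        System.out.println(skipInputCaching(true, new InputState(true, true)));
        // No grouping: outside the general case the issue targets.
        System.out.println(skipInputCaching(false, new InputState(true, false)));
    }
}
```

      Keeping the rule as one conjunction makes the narrow scope explicit: if any of the three conditions fails, the existing caching behavior is left unchanged.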

          People

            Assignee: mboehm7 Matthias Boehm
            Reporter: mboehm7 Matthias Boehm
            Votes: 0
            Watchers: 1

            Dates

              Created:
              Updated:
              Resolved: