  1. SystemDS
  2. SYSTEMDS-1830

Improve the data locality for the tasks in ParFor body

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • SystemML 1.0.0
    • None
    • None
    • Sprint 5

    Description

      In RemoteParForSpark, tasks are parallelized without considering the data locality of the input matrices. When the input data is large, this causes a lot of data shuffling.

      We can predict where the blocks of the input matrices are located and attach this location information when parallelizing the ParFor program body, so that each task can be scheduled close to the data it reads.
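
The idea can be illustrated with a minimal, self-contained sketch. This is not the RemoteParForSpark implementation; the function name `preferred_hosts`, the block identifiers, the host names, and the block sizes are all hypothetical. The sketch assumes we already know which hosts hold each input block, and it ranks, per task, the hosts that hold the most input bytes.

```python
from collections import Counter


def preferred_hosts(task_inputs, block_hosts, block_sizes, top_k=2):
    """Return, per task, up to top_k hosts holding the most input bytes (best first).

    task_inputs: task id -> list of block ids the task reads (hypothetical layout)
    block_hosts: block id -> list of hosts storing a replica of that block
    block_sizes: block id -> size of the block in bytes
    """
    prefs = {}
    for task_id, blocks in task_inputs.items():
        bytes_per_host = Counter()
        for block in blocks:
            # Every host with a replica could serve this block locally.
            for host in block_hosts.get(block, ()):
                bytes_per_host[host] += block_sizes.get(block, 0)
        # Sort by descending byte count, then by host name for a deterministic order.
        ranked = sorted(bytes_per_host.items(), key=lambda kv: (-kv[1], kv[0]))
        prefs[task_id] = [host for host, _ in ranked[:top_k]]
    return prefs


# Hypothetical cluster layout, used only for illustration.
block_hosts = {"X_0": ["node1", "node2"], "X_1": ["node2"], "X_2": ["node3"]}
block_sizes = {"X_0": 128, "X_1": 128, "X_2": 64}
task_inputs = {0: ["X_0", "X_1"], 1: ["X_2"]}

task_prefs = preferred_hosts(task_inputs, block_hosts, block_sizes)
print(task_prefs)
```

In Spark, such (task, hosts) pairs could be passed as location preferences when the task list is parallelized, for example through the Scala `SparkContext.makeRDD` overload that accepts a `Seq[(T, Seq[String])]`. Whether that is the right hook for the ParFor body is an assumption here, not something this issue states.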

            People

              Assignee: Tenma Fei Hu
              Reporter: Tenma Fei Hu
              Votes: 0
              Watchers: 1

              Dates

                Created:
                Updated: