SystemDS / SYSTEMDS-1009

Avoid spark context creation on parfor optimization

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • None
    • SystemML 0.11
    • None
    • None

    Description

      Currently, every parfor script triggers the lazy Spark context creation during parfor optimization, independent of its input data size and script, in order to obtain memory budgets and the degree of parallelism. On small data, the Spark context creation dominates end-to-end execution time. We should change this to a configuration-only analysis, which would avoid the context creation.
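As a rough illustration of what a configuration-only analysis could look like, the sketch below derives a memory budget and a degree of parallelism from plain configuration properties, without starting any context. It is not the SystemDS implementation. The property keys are standard Spark settings, but the fallback values, the `memory_fraction` parameter, and the function names are assumptions made for this example.

```python
import re

# Assumed fallback values when a key is absent; real defaults depend on the
# Spark version and the cluster manager.
ASSUMED_DEFAULTS = {
    "spark.executor.memory": "1g",
    "spark.executor.instances": "2",
    "spark.executor.cores": "1",
}

_UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}


def parse_memory(value):
    """Convert a size string such as '512m' or '4g' into bytes."""
    match = re.fullmatch(r"(\d+)\s*([kmgt])b?", value.strip().lower())
    if match is None:
        raise ValueError(f"unsupported memory string: {value!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def analyze_config(conf, memory_fraction=0.6):
    """Derive memory budget and parallelism from configuration only.

    `conf` is a plain dict of property strings; no cluster connection is made.
    `memory_fraction` is a hypothetical share of executor memory usable for data.
    """
    def get(key):
        return conf.get(key, ASSUMED_DEFAULTS[key])

    executor_mem = parse_memory(get("spark.executor.memory"))
    num_executors = int(get("spark.executor.instances"))
    cores = int(get("spark.executor.cores"))
    return {
        "memory_budget_bytes": int(executor_mem * memory_fraction),
        "parallelism": num_executors * cores,
    }
```

Because the function only reads a dictionary of properties, calling it costs microseconds, whereas creating a context involves contacting the cluster. The trade-off is that the result reflects the configured resources, not what the cluster actually grants at runtime.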

      For example, here are the XS and S performance results for univariate statistics:

      UnivariateStatistics on mbperftest/bivar/A_10k/data: 14
      UnivariateStatistics on mbperftest/bivar/A_10k/data: 14
      UnivariateStatistics on mbperftest/bivar/A_10k/data: 17
      UnivariateStatistics on mbperftest/bivar/A_10k/data: 16
      
      UnivariateStatistics on mbperftest/bivar/A_100k/data: 14
      UnivariateStatistics on mbperftest/bivar/A_100k/data: 15
      UnivariateStatistics on mbperftest/bivar/A_100k/data: 14
      UnivariateStatistics on mbperftest/bivar/A_100k/data: 17
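To make the comparison between the two result groups explicit, the following small sketch averages the four runs of each group. The numbers are copied from the results above and the unit is left as reported there; the variable names are chosen for this example only.

```python
from statistics import mean

# Runtimes copied from the results listed above (unit as reported there).
runs = {
    "A_10k": [14, 14, 17, 16],
    "A_100k": [14, 15, 14, 17],
}

# Average runtime per dataset and the ratio between the two averages.
averages = {name: mean(values) for name, values in runs.items()}
ratio = averages["A_100k"] / averages["A_10k"]

for name, avg in averages.items():
    print(f"{name}: {avg}")
print(f"A_100k / A_10k: {ratio:.2f}")
```

The averages are 15.25 and 15, so the runtime is essentially flat even though the dataset names suggest a tenfold difference in size. That is consistent with a fixed startup cost, such as context creation, dominating the end-to-end time on small inputs.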
      

            People

              Assignee: mboehm7 Matthias Boehm
              Reporter: mboehm7 Matthias Boehm
              Votes: 0
              Watchers: 1
