Details

    • Type: Sub-task
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: APIs, Runtime
    • Labels: None

    Description

      The parfor optimizer may decide to execute the entire loop as a remote Spark job in order to utilize cluster parallelism. In this case, all inputs to the parfor body (i.e., variables that are created or read outside the parfor body but used or overwritten inside it) are read from HDFS. In the past, there was an issue of redundant reads, which was addressed in SYSTEMML-1879. However, directly using Spark broadcast variables would likely improve performance further, especially in clusters with many nodes.
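      The performance argument can be illustrated with a plain-Python sketch (not SystemML or Spark code; `read_matrix` and both runner functions are hypothetical stand-ins): without a broadcast, every remote task re-reads the input, whereas a broadcast-style scheme reads it once on the driver and ships the same value to all tasks.

```python
hdfs_reads = 0  # counts simulated HDFS accesses


def read_matrix(name):
    """Hypothetical stand-in for reading a parfor input from HDFS."""
    global hdfs_reads
    hdfs_reads += 1
    return [[1.0, 2.0], [3.0, 4.0]]  # toy matrix payload


def run_tasks_without_broadcast(num_tasks):
    # Every remote task re-reads the input from HDFS.
    return [read_matrix("X")[0][0] for _ in range(num_tasks)]


def run_tasks_with_broadcast(num_tasks):
    # The driver reads the input once and all tasks reuse the same value,
    # analogous to a Spark broadcast variable.
    bc = read_matrix("X")
    return [bc[0][0] for _ in range(num_tasks)]


hdfs_reads = 0
run_tasks_without_broadcast(8)
reads_plain = hdfs_reads        # one read per task

hdfs_reads = 0
run_tasks_with_broadcast(8)
reads_broadcast = hdfs_reads    # a single read, regardless of task count
```

      With 8 tasks, the first variant performs 8 reads and the broadcast variant a single read; in a real cluster the saving scales with the number of tasks and nodes.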

      This task aims to leverage Spark broadcast variables for all parfor inputs. In detail, this entails two major aspects. First, we need runtime support to optionally broadcast the inputs via broadcast variables in RemoteParForSpark and to obtain them from these broadcast variables in RemoteParForSparkWorker without causing unnecessary eviction. In contrast to the existing broadcast primitives, we do not need to blockify the matrices because they are accessed in full by in-memory operations. Second, this requires an extension of the parfor optimizer to reason about the scenarios in which broadcasting is safe, because broadcasts create additional memory requirements as they act as matrices pinned in memory. This second task likely overlaps with SYSTEMML-1349, which requires similar reasoning to handle shared reads.
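      The optimizer-side reasoning could take the following shape; this is a minimal sketch, with all names, the budget fraction, and the decision rule being illustrative assumptions rather than the actual parfor optimizer API:

```python
def can_broadcast(input_sizes_bytes, executor_mem_bytes, broadcast_fraction=0.3):
    """Illustrative check: only broadcast the parfor inputs if they all fit
    into a fixed fraction of the executor memory budget, since broadcast
    values stay resident (pinned) on every executor for the job's lifetime.
    The 0.3 fraction is an assumed example value, not a SystemML constant."""
    pinned = sum(input_sizes_bytes)               # total pinned broadcast memory
    budget = broadcast_fraction * executor_mem_bytes
    return pinned <= budget


# Example: two inputs of 100 MB and 50 MB against 1 GB executors fit the
# assumed 30% budget; a single 800 MB input does not.
ok = can_broadcast([100 * 1024**2, 50 * 1024**2], 1024**3)
too_big = can_broadcast([800 * 1024**2], 1024**3)
```

      A real implementation would additionally have to account for the memory already reserved by operations inside the parfor body, which is where the overlap with the shared-reads reasoning of SYSTEMML-1349 arises.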

People

    • Assignee: Guobao LI
    • Reporter: Matthias Boehm
