  1. SystemDS
  2. SYSTEMDS-1774

Improve Parfor parallelism for deep learning

Details

    Description

      When running the distributed MNIST LeNet example, each mini-batch could ideally run in parallel without interaction. We tried to force the loop parfor (j in 1:parallel_batches) at line 137 of nn/examples/mnist_lenet_distrib_sgd.dml to use REMOTE_SPARK mode by changing it to parfor (j in 1:parallel_batches, mode=REMOTE_SPARK, opt=CONSTRAINED). With the execution mode SPARK this fails with org.apache.sysml.runtime.DMLRuntimeException: Not supported: Instructions of type other than CP instructions. With the execution mode HYBRID_SPARK it fails with java.lang.NullPointerException. More log information can be found in the comments below.
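
      For illustration, the modification amounts to adding the mode and opt parameters to the parfor header. The following is a minimal sketch only: the loop body, the result matrix R, and the value of parallel_batches are hypothetical placeholders and are not taken from the attached script.

      ```
      # Hypothetical stand-in for the loop at line 137 of the attached script.
      parallel_batches = 4
      R = matrix(0, rows=parallel_batches, cols=1)

      # Request remote Spark execution; opt=CONSTRAINED asks the parfor
      # optimizer to respect the parameters given in the header.
      parfor (j in 1:parallel_batches, mode=REMOTE_SPARK, opt=CONSTRAINED) {
        # Placeholder body: each iteration writes only its own row of R,
        # so iterations do not interact.
        R[j, 1] = as.matrix(j * 2)
      }
      print(sum(R))
      ```

      Each iteration writes a disjoint row of R, which mirrors the assumption above that mini-batches are independent of one another.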

      Attachments

        1. mnist_lenet_distrib_sgd.dml
          18 kB
          Fei Hu
        2. Explain_For_Spark_Mode.txt
          118 kB
          Fei Hu
        3. MNIST_Distrib_Sgd.scala
          1 kB
          Fei Hu
        4. Explain_For_HYBRID_SPARK_Mode_With_ErrorInfo.txt
          131 kB
          Fei Hu

            People

              Tenma Fei Hu
              Tenma Fei Hu
              Votes: 0
              Watchers: 4

              Dates

                Created:
                Updated:
                Resolved: