  1. SystemML
  2. SYSTEMML-1774

Improve Parfor parallelism for deep learning

    Details

      Description

      When running the distributed MNIST LeNet example, each mini-batch could ideally run in parallel without interaction. We tried to force the parfor loop at line 137 of nn/examples/mnist_lenet_distrib_sgd.dml to use REMOTE_SPARK mode by changing parfor (j in 1:parallel_batches) to parfor (j in 1:parallel_batches, mode=REMOTE_SPARK, opt=CONSTRAINED). This failed in two ways. With execution mode SPARK, it raised org.apache.sysml.runtime.DMLRuntimeException: Not supported: Instructions of type other than CP instructions. With execution mode HYBRID_SPARK, it raised java.lang.NullPointerException. More log information can be found in the comments that follow and in the attached explain outputs.
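
      For reference, the shape of the change can be sketched in a few lines of DML. This is only an illustration, not the code from mnist_lenet_distrib_sgd.dml: the loop body and the results matrix are hypothetical stand-ins for the per-batch training step, and only the parfor header (mode=REMOTE_SPARK, opt=CONSTRAINED) is taken from the description above.

```dml
# Minimal sketch, assuming a toy loop body in place of the real
# forward/backward pass; `results` is a hypothetical name.
parallel_batches = 4
results = matrix(0, rows=parallel_batches, cols=1)

# Same parfor header as in the description: request remote Spark
# execution and have the optimizer respect the user-specified mode.
parfor (j in 1:parallel_batches, mode=REMOTE_SPARK, opt=CONSTRAINED) {
  # Each iteration writes only to its own row, so iterations are
  # independent and can run without interaction.
  results[j, 1] = as.matrix(j * j)
}

print("sum of per-batch results: " + sum(results))
```

      Whether a loop like this actually runs remotely may still depend on the instructions compiled for its body, which is the limitation the reported errors point at.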

        Attachments

        1. Explain_For_HYBRID_SPARK_Mode_With_ErrorInfo.txt
          131 kB
          Fei Hu
        2. Explain_For_Spark_Mode.txt
          118 kB
          Fei Hu
        3. MNIST_Distrib_Sgd.scala
          1 kB
          Fei Hu
        4. mnist_lenet_distrib_sgd.dml
          18 kB
          Fei Hu

          Issue Links

            Activity

              People

              • Assignee:
                Tenma Fei Hu
                Reporter:
                Tenma Fei Hu
              • Votes:
                0
                Watchers:
                5

                Dates

                • Created:
                  Updated:
                  Resolved: