SystemDS / SYSTEMDS-1774

Improve Parfor parallelism for deep learning


Details

    Description

      When running the distributed MNIST LeNet example, each mini-batch could ideally run in parallel without interaction. We tried to force the loop parfor (j in 1:parallel_batches) at line 137 of nn/examples/mnist_lenet_distrib_sgd.dml to run in REMOTE_SPARK mode by changing it to parfor (j in 1:parallel_batches, mode=REMOTE_SPARK, opt=CONSTRAINED). This failed with org.apache.sysml.runtime.DMLRuntimeException: Not supported: Instructions of type other than CP instructions when using the execution mode SPARK, and with java.lang.NullPointerException when using the execution mode HYBRID_SPARK. More log information can be found in the comments below.
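
      For reference, a minimal sketch of the parfor option syntax described above. The loop body is a hypothetical stand-in, not the code at line 137 of mnist_lenet_distrib_sgd.dml, and the variable results and the value assigned to parallel_batches are assumptions made for illustration. It is not expected to reproduce the reported errors on its own; it only shows where the mode and opt parameters go.

      ```dml
      # Hypothetical stand-in for the mini-batch loop (not the original script).
      parallel_batches = 4
      results = matrix(0, rows=parallel_batches, cols=1)

      # Force remote Spark execution; opt=CONSTRAINED asks the parfor optimizer
      # to respect the user-specified parameters instead of overriding them.
      parfor (j in 1:parallel_batches, mode=REMOTE_SPARK, opt=CONSTRAINED) {
        # Each iteration writes to its own row, so iterations do not interact.
        results[j, 1] = as.matrix(j * 2)
      }

      print(sum(results))
      ```

      In the real script, the body would contain the forward and backward passes for one mini-batch, which is presumably where the non-CP instructions come from.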


          People

            Tenma Fei Hu
            Votes: 0
            Watchers: 4


              Agile

                Future Sprint: Sprint 2