Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Sprint: Sprint 2
Description
Running the distributed MNIST LeNet example works well in hybrid mode. In Spark mode, however, reshaping the matrix fails with java.lang.NullPointerException and java.lang.ArrayIndexOutOfBoundsException: 1000. The functions involved are org.apache.sysml.runtime.matrix.data.LibMatrixReorg#reshapeSparse and org.apache.sysml.runtime.matrix.data.LibMatrixReorg#reshapeDense. The cause is that the output matrix block index computed by org.apache.sysml.runtime.matrix.data.LibMatrixReorg#computeResultBlockIndex does not match the keys in the HashMap<MatrixIndexes,MatrixBlock> rix.
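To illustrate how such a mismatch can surface as a NullPointerException, here is a minimal, self-contained sketch. It is not the SystemML code: BlockIndex, targetBlock and allocateBlocks are hypothetical stand-ins for MatrixIndexes, computeResultBlockIndex and the population of rix, and the sketch assumes a row-wise reshape with 1-based block indexes. It only shows that a lookup returns null when the index is computed with different dimensions than the ones used to allocate the output blocks.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class ReshapeIndexSketch {
    /** 1-based block coordinates; a hypothetical stand-in for MatrixIndexes. */
    static final class BlockIndex {
        final long row, col;
        BlockIndex(long row, long col) { this.row = row; this.col = col; }
        @Override public boolean equals(Object o) {
            if (!(o instanceof BlockIndex)) return false;
            BlockIndex other = (BlockIndex) o;
            return row == other.row && col == other.col;
        }
        @Override public int hashCode() { return Objects.hash(row, col); }
        @Override public String toString() { return "(" + row + "," + col + ")"; }
    }

    /** Output block that receives input cell (i, j) after a row-wise reshape to cols2 columns. */
    static BlockIndex targetBlock(long i, long j, long cols1, long cols2, int blen) {
        long linear = i * cols1 + j;   // position of the cell in row-major order
        long ri = linear / cols2;      // 0-based row in the reshaped matrix
        long ci = linear % cols2;      // 0-based column in the reshaped matrix
        return new BlockIndex(ri / blen + 1, ci / blen + 1);
    }

    /** Pre-allocates one (placeholder) entry per output block, keyed by block index. */
    static Map<BlockIndex, double[]> allocateBlocks(long rows2, long cols2, int blen) {
        Map<BlockIndex, double[]> rix = new HashMap<>();
        long rowBlocks = (rows2 + blen - 1) / blen;
        long colBlocks = (cols2 + blen - 1) / blen;
        for (long r = 1; r <= rowBlocks; r++)
            for (long c = 1; c <= colBlocks; c++)
                rix.put(new BlockIndex(r, c), new double[0]);
        return rix;
    }

    public static void main(String[] args) {
        int blen = 1000;
        // Reshape a 2 x 3000 matrix into 3 x 2000: output blocks are (1,1) and (1,2).
        Map<BlockIndex, double[]> rix = allocateBlocks(3, 2000, blen);

        // Consistent dimensions: the computed index is one of the allocated keys.
        BlockIndex consistent = targetBlock(1, 2999, 3000, 2000, blen);
        System.out.println(consistent + " present: " + (rix.get(consistent) != null));

        // Inconsistent dimensions (3000 output columns assumed by mistake):
        // the computed index is not a key, so get() returns null.
        BlockIndex inconsistent = targetBlock(1, 2999, 3000, 3000, blen);
        System.out.println(inconsistent + " present: " + (rix.get(inconsistent) != null));
    }
}
```

Dereferencing the null returned by the second lookup is the kind of failure reported above; the actual source of the inconsistency in Spark mode is not identified by this sketch.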
To reproduce the error, run the distributed MNIST example with the attached Scala file MNIST_Distrib_Sgd.scala.
In addition, if code is added to ignore a null output matrix block returned by MatrixBlock out = rix.get(ixtmp), the distributed MNIST example runs in Spark mode, but the result may not be correct.
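A small sketch of why skipping null blocks hides the problem rather than fixing it. Everything here is hypothetical and simplified (integer block ids, plain double[] blocks, invented method names); it is not the LibMatrixReorg code. It contrasts a variant that silently skips missing blocks, and therefore drops values, with one that fails fast and names the missing key.

```java
import java.util.HashMap;
import java.util.Map;

public class NullBlockGuardSketch {
    /** Writes each value into its block and skips values whose block is missing. Returns the number skipped. */
    static int scatterSkippingNulls(Map<Integer, double[]> blocks,
                                    int[] blockIds, int[] offsets, double[] values) {
        int dropped = 0;
        for (int k = 0; k < values.length; k++) {
            double[] out = blocks.get(blockIds[k]);
            if (out == null) {   // the workaround: ignore the missing block
                dropped++;
                continue;
            }
            out[offsets[k]] = values[k];
        }
        return dropped;
    }

    /** Same scatter, but reports the first missing block instead of hiding it. */
    static void scatterFailFast(Map<Integer, double[]> blocks,
                                int[] blockIds, int[] offsets, double[] values) {
        for (int k = 0; k < values.length; k++) {
            double[] out = blocks.get(blockIds[k]);
            if (out == null)
                throw new IllegalStateException("No output block for index "
                        + blockIds[k] + "; known keys: " + blocks.keySet());
            out[offsets[k]] = values[k];
        }
    }

    /** Sum of all cells across all blocks, used to show that values were lost. */
    static double sum(Map<Integer, double[]> blocks) {
        double total = 0;
        for (double[] block : blocks.values())
            for (double v : block)
                total += v;
        return total;
    }

    public static void main(String[] args) {
        Map<Integer, double[]> blocks = new HashMap<>();
        blocks.put(1, new double[2]);
        blocks.put(2, new double[2]);

        int[] blockIds = {1, 2, 3};      // block 3 was never allocated
        int[] offsets = {0, 1, 0};
        double[] values = {1.0, 2.0, 4.0};

        int dropped = scatterSkippingNulls(blocks, blockIds, offsets, values);
        System.out.println("dropped=" + dropped + ", sum=" + sum(blocks));
    }
}
```

With the skipping variant the run completes, but the value destined for the missing block never reaches the output, which is consistent with the observation that the result may not be correct.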