  1. Spark
  2. SPARK-16566

Bug in SparseMatrix multiplication with SparseVector

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.6.2
    • Fix Version/s: None
    • Component/s: MLlib
    • Labels:
      None

      Description

      In org.apache.spark.mllib.linalg.BLAS (BLAS.scala), the multiplication of a SparseMatrix (sm) by a SparseVector (sv), when sm is not transposed, assumes that the indices of sv are sorted. Nothing validates that assumption, so a vector with unsorted indices produces an incorrect result.

      To reproduce, start spark-shell and enter these commands:

      import org.apache.spark.mllib.linalg.SparseMatrix
      import org.apache.spark.mllib.linalg.SparseVector
      import org.apache.spark.mllib.linalg.DenseVector
      import scala.collection.mutable.ArrayBuffer

      val vectorIndices = Array(3,2)
      val vectorValues = Array(0.1,0.2)
      val size = 4

      val sm = new SparseMatrix(size, size, Array(0, 0, 0, 1, 1), Array(0), Array(1.0))
      val dm = sm.toDense
      val sv = new SparseVector(size, vectorIndices, vectorValues)
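      val dv = new DenseVector(sv.toArray)

      // Both products should be equal, so this comparison is expected to be true
      sm.multiply(dv) == sm.multiply(sv)

      // Inspect the two results individually
      sm.multiply(dv)
      sm.multiply(sv)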
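
A possible caller-side workaround, sketched here under the assumption that the problem comes only from the unsorted indices of sv, is to check and sort the index and value arrays before constructing the SparseVector. The object SparseIndexUtil and its methods are hypothetical helpers written for illustration; they are not part of Spark's API and this is not the fix applied in Spark itself.

```scala
// Hypothetical helpers (not part of Spark) for preparing sparse vector input.
object SparseIndexUtil {

  /** True when the indices are strictly increasing (sorted, no duplicates). */
  def isStrictlyIncreasing(indices: Array[Int]): Boolean =
    indices.length < 2 || indices.sliding(2).forall(p => p(0) < p(1))

  /** Returns copies of indices and values, reordered so the indices ascend. */
  def sortByIndex(indices: Array[Int],
                  values: Array[Double]): (Array[Int], Array[Double]) = {
    require(indices.length == values.length,
      "indices and values must have the same length")
    // Positions of the original arrays, ordered by the index stored at each position.
    val order = indices.indices.sortBy(i => indices(i))
    (order.map(i => indices(i)).toArray, order.map(i => values(i)).toArray)
  }
}

// Using the arrays from the reproduction above.
val (sortedIndices, sortedValues) =
  SparseIndexUtil.sortByIndex(Array(3, 2), Array(0.1, 0.2))
```

The sorted arrays can then be passed to the constructor in place of the originals, for example new SparseVector(size, sortedIndices, sortedValues), so that the multiplication receives indices in ascending order.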

              People

              • Assignee:
                Unassigned
                Reporter:
                wilson.lauw Wilson