Description
There is a bug in how a transposed SparseMatrix (isTransposed=true) is multiplied by a SparseVector. The bug is present (for v. > 2.0.0) in both org.apache.spark.mllib.linalg.BLAS (mllib) and org.apache.spark.ml.linalg.BLAS (mllib-local), in the private gemv method with signature:
gemv(alpha: Double, A: SparseMatrix, x: SparseVector, beta: Double, y: DenseVector).
This bug can be verified by running the following snippet in a Spark shell (here using v1.6.1):
import com.holdenkarau.spark.testing.SharedSparkContext
import org.apache.spark.mllib.linalg._
val A = Matrices.dense(3, 2, Array[Double](0, 2, 1, 1, 2, 0)).asInstanceOf[DenseMatrix].toSparse.transpose
val b = Vectors.sparse(3, Seq[(Int, Double)]((1, 2), (2, 1))).asInstanceOf[SparseVector]
A.multiply(b)
A.multiply(b.toDense)
The first multiply, of the transposed SparseMatrix with the SparseVector, returns the incorrect result:
org.apache.spark.mllib.linalg.DenseVector = [5.0,0.0]
whereas the correct result is returned by the second multiply, which uses the dense copy of the same vector:
org.apache.spark.mllib.linalg.DenseVector = [5.0,4.0]
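The expected value can also be checked without Spark. The following is a minimal sketch, not Spark's gemv implementation: the object SparseGemvCheck and its multiply method are hypothetical names introduced here for illustration. It assumes that the transposed 2 x 3 matrix from the snippet is held in compressed sparse row form (row pointers, column indices, values) and that the vector is given as sorted indices with matching values.

```scala
object SparseGemvCheck {
  // Computes y = A * x, where A is in compressed sparse row (CSR) form
  // and x is a sparse vector given as sorted indices plus values.
  // Hypothetical helper for illustration; this is not Spark's BLAS.gemv.
  def multiply(
      numRows: Int,
      rowPtrs: Array[Int],
      colIndices: Array[Int],
      values: Array[Double],
      xIndices: Array[Int],
      xValues: Array[Double]): Array[Double] = {
    val y = new Array[Double](numRows)
    var row = 0
    while (row < numRows) {
      var i = rowPtrs(row)          // position in the current row of A
      val end = rowPtrs(row + 1)    // one past the last entry of the row
      var k = 0                     // position in x, restarted for every row
      var sum = 0.0
      // Merge the two sorted index lists; only matching indices contribute.
      while (i < end && k < xIndices.length) {
        if (colIndices(i) == xIndices(k)) {
          sum += values(i) * xValues(k)
          i += 1
          k += 1
        } else if (colIndices(i) < xIndices(k)) {
          i += 1
        } else {
          k += 1
        }
      }
      y(row) = sum
      row += 1
    }
    y
  }

  // A = [[0, 2, 1], [1, 2, 0]] in CSR form, b = [0, 2, 1] as a sparse vector.
  val result: Array[Double] = multiply(
    numRows = 2,
    rowPtrs = Array(0, 2, 4),
    colIndices = Array(1, 2, 0, 1),
    values = Array(2.0, 1.0, 1.0, 2.0),
    xIndices = Array(1, 2),
    xValues = Array(2.0, 1.0))
}
```

Here SparseGemvCheck.result holds 5.0 and 4.0, matching the dense-vector multiply above: row 0 gives 2*2 + 1*1 = 5 and row 1 gives 2*2 = 4.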