SPARK-10875

RowMatrix.computeCovariance() result is not exactly symmetric

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.5.0
    • Fix Version/s: 1.6.0
    • Component/s: MLlib
    • Labels: None

    Description

      For some matrices, I have seen that the covariance matrix returned by RowMatrix.computeCovariance() is not exactly symmetric, most likely because of floating-point rounding errors. This is a problem when constructing an instance of MultivariateGaussian, which requires an exactly symmetric covariance matrix. See the reproducible example below.

      I would suggest modifying the implementation so that G(i, j) and G(j, i) are set at the same time, with the same value.
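      The suggestion above can be sketched in plain Scala. This is only an illustration of the idea, not Spark's actual code: the helper name symmetricCovariance, the array-of-arrays layout, and the parameters gram (sum of x_i * x_j over all rows), mean (column means) and m (row count) are assumptions made for this example. Each value is computed once for the upper triangle and written to both (i, j) and (j, i), so the two entries cannot differ by rounding.

```scala
// Hypothetical helper (not part of Spark): sample covariance from a Gramian,
// the column means, and the number of rows m.
// cov(i, j) = gram(i, j) / (m - 1) - m / (m - 1) * mean(i) * mean(j)
def symmetricCovariance(
    gram: Array[Array[Double]],
    mean: Array[Double],
    m: Long): Array[Array[Double]] = {
  require(m > 1, "need at least two rows")
  val n = mean.length
  val m1 = m - 1.0
  val cov = Array.ofDim[Double](n, n)
  var i = 0
  while (i < n) {
    val alpha = m / m1 * mean(i)
    var j = i // only visit the upper triangle, including the diagonal
    while (j < n) {
      val v = gram(i)(j) / m1 - alpha * mean(j)
      cov(i)(j) = v
      cov(j)(i) = v // mirror the same value, so the result is exactly symmetric
      j += 1
    }
    i += 1
  }
  cov
}
```

      Because the mirrored entry is a copy rather than a second computation, the order of the floating-point operations no longer matters for symmetry.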

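      import org.apache.spark.mllib.linalg.distributed.RowMatrix
      import org.apache.spark.mllib.random.RandomRDDs
      import org.apache.spark.mllib.stat.distribution.MultivariateGaussian

      // sc is an existing SparkContext (e.g. in spark-shell); 100 rows x 10 columns, numPartitions = 0, seed = 0
      val rdd = RandomRDDs.normalVectorRDD(sc, 100, 10, 0, 0)
      val matrix = new RowMatrix(rdd)
      val mean = matrix.computeColumnSummaryStatistics().mean
      val cov = matrix.computeCovariance()
      val dist = new MultivariateGaussian(mean, cov) // throws breeze.linalg.MatrixNotSymmetricException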
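
      On affected versions, one possible workaround is to force symmetry on the computed matrix before passing it to MultivariateGaussian. The sketch below is an assumption of this example, not something proposed in the ticket: the helper symmetrize is hypothetical, and averaging the two mirrored entries is just one reasonable choice. It works on a column-major array, which is the layout returned by Matrix.toArray.

```scala
// Hypothetical helper (not part of Spark): returns a copy of an n x n
// column-major matrix in which entries (i, j) and (j, i) hold the same value.
def symmetrize(n: Int, values: Array[Double]): Array[Double] = {
  require(values.length == n * n, "expected an n x n matrix")
  val out = values.clone()
  for (i <- 0 until n; j <- i + 1 until n) {
    // Column-major layout: element (row, col) is stored at index row + col * n.
    val avg = 0.5 * (values(i + j * n) + values(j + i * n))
    out(i + j * n) = avg
    out(j + i * n) = avg
  }
  out
}
```

      With the example above, this could be applied as Matrices.dense(cov.numRows, cov.numCols, symmetrize(cov.numRows, cov.toArray)), using org.apache.spark.mllib.linalg.Matrices, and the result passed to MultivariateGaussian in place of cov.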
      

          People

            Assignee: Nick Pritchard (pnpritchard)
            Reporter: Nick Pritchard (pnpritchard)
            Shepherd: Xiangrui Meng