Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-19184

Improve numerical stability for method tallSkinnyQR.

    XMLWordPrintableJSON

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Won't Fix
    • Affects Version/s: 2.2.0
    • Fix Version/s: None
    • Component/s: MLlib
    • Labels:

      Description

      In method tallSkinnyQR, the final Q is calculated by A * inv(R) (Github Link). When the upper triangular matrix R is ill-conditioned, computing the inverse of R can result in catastrophic cancellation. Instead, we should consider using a forward solve for solving Q such that Q * R = A.

      I first create a 4 by 4 RowMatrix A = (1,1,1,1;0,1E-5,0,0;0,0,1E-10,1;0,0,0,1E-14), and then I apply method tallSkinnyQR to A to find RowMatrix Q and Matrix R such that A = Q*R. In this case, A is ill-conditioned and so is R.

      See codes in Spark Shell:

      import org.apache.spark.mllib.linalg.{Matrices, Vector, Vectors}
      import org.apache.spark.mllib.linalg.distributed.RowMatrix
      
      // Create RowMatrix A.
      val mat = Seq(Vectors.dense(1,1,1,1), Vectors.dense(0, 1E-5, 1,1), Vectors.dense(0,0,1E-10,1), Vectors.dense(0,0,0,1E-14))
      val denseMat = new RowMatrix(sc.parallelize(mat, 2))
      
      // Apply tallSkinnyQR to A.
      val result = denseMat.tallSkinnyQR(true)
      
      // Print the calculated Q and R.
      result.Q.rows.collect.foreach(println)
      result.R
      
      // Calculate Q*R. Ideally, this should be close to A.
      val reconstruct = result.Q.multiply(result.R)
      reconstruct.rows.collect.foreach(println)
      
      // Calculate Q'*Q. Ideally, this should be close to the identity matrix.
      result.Q.computeGramianMatrix()
      
      System.exit(0)
      

      it will output the following results:

      scala> result.Q.rows.collect.foreach(println)
      [1.0,0.0,0.0,1.5416524685312E13]
      [0.0,0.9999999999999999,0.0,8011776.0]
      [0.0,0.0,1.0,0.0]
      [0.0,0.0,0.0,1.0]
      
      scala> result.R
      1.0  1.0     1.0      1.0
      0.0  1.0E-5  1.0      1.0
      0.0  0.0     1.0E-10  1.0
      0.0  0.0     0.0      1.0E-14
      
      scala> reconstruct.rows.collect.foreach(println)
      [1.0,1.0,1.0,1.15416524685312]
      [0.0,9.999999999999999E-6,0.9999999999999999,1.00000008011776]
      [0.0,0.0,1.0E-10,1.0]
      [0.0,0.0,0.0,1.0E-14]
      
      scala> result.Q.computeGramianMatrix()
      1.0                 0.0                 0.0  1.5416524685312E13
      0.0                 0.9999999999999998  0.0  8011775.999999999
      0.0                 0.0                 1.0  0.0
      1.5416524685312E13  8011775.999999999   0.0  2.3766923337289844E26
      

      With forward solve for solving Q such that Q * R = A rather than computing the inverse of R, it will output the following results instead:

      scala> result.Q.rows.collect.foreach(println)
      [1.0,0.0,0.0,0.0]
      [0.0,1.0,0.0,0.0]
      [0.0,0.0,1.0,0.0]
      [0.0,0.0,0.0,1.0]
      
      scala> result.R
      1.0  1.0     1.0      1.0
      0.0  1.0E-5  1.0      1.0
      0.0  0.0     1.0E-10  1.0
      0.0  0.0     0.0      1.0E-14
      
      scala> reconstruct.rows.collect.foreach(println)
      [1.0,1.0,1.0,1.0]
      [0.0,1.0E-5,1.0,1.0]
      [0.0,0.0,1.0E-10,1.0]
      [0.0,0.0,0.0,1.0E-14]
      
      scala> result.Q.computeGramianMatrix()
      1.0  0.0  0.0  0.0
      0.0  1.0  0.0  0.0
      0.0  0.0  1.0  0.0
      0.0  0.0  0.0  1.0
      

        Attachments

          Activity

            People

            • Assignee:
              Unassigned
              Reporter:
              hl475 Huamin Li
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: