Commons Math / MATH-789

Correlated random vector generator fails (silently) when faced with zero rows in covariance matrix


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0
    • Fix Version/s: 3.1
    • Component/s: None
    • Labels: None
    • Environment: JDK 1.6 / Eclipse Indigo on Ubuntu 10.04

    Description

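      The following three matrices produce different results when sampling a multivariate Gaussian with the help of CorrelatedRandomVectorGenerator. The first two are permutations of each other (the all-zero row and column are the first and the third, respectively); the third is the same matrix with the zero row and column removed. Sample covariances were calculated in R, based on 10,000 samples: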

      Array2DRowRealMatrix{
        {0.0,0.0,0.0,0.0,0.0},
        {0.0,0.013445532,0.01039469,0.009881156,0.010499559},
        {0.0,0.01039469,0.023006616,0.008196856,0.010732709},
        {0.0,0.009881156,0.008196856,0.019023866,0.009210099},
        {0.0,0.010499559,0.010732709,0.009210099,0.019107243}}

      > cov(data1)
         V1          V2         V3          V4          V5
      V1  0 0.000000000 0.00000000 0.000000000 0.000000000
      V2  0 0.013383931 0.01034401 0.009913271 0.010506733
      V3  0 0.010344006 0.02309479 0.008374730 0.010759306
      V4  0 0.009913271 0.00837473 0.019005488 0.009187287
      V5  0 0.010506733 0.01075931 0.009187287 0.019021483

      Array2DRowRealMatrix{
        {0.013445532,0.01039469,0.0,0.009881156,0.010499559},
        {0.01039469,0.023006616,0.0,0.008196856,0.010732709},
        {0.0,0.0,0.0,0.0,0.0},
        {0.009881156,0.008196856,0.0,0.019023866,0.009210099},
        {0.010499559,0.010732709,0.0,0.009210099,0.019107243}}

      > cov(data2)
                  V1          V2 V3          V4          V5
      V1 0.006922905 0.010507692  0 0.005817399 0.010330529
      V2 0.010507692 0.023428918  0 0.008273152 0.010735568
      V3 0.000000000 0.000000000  0 0.000000000 0.000000000
      V4 0.005817399 0.008273152  0 0.004929843 0.009048759
      V5 0.010330529 0.010735568  0 0.009048759 0.018683544

      Array2DRowRealMatrix{
        {0.013445532,0.01039469,0.009881156,0.010499559},
        {0.01039469,0.023006616,0.008196856,0.010732709},
        {0.009881156,0.008196856,0.019023866,0.009210099},
        {0.010499559,0.010732709,0.009210099,0.019107243}}

      > cov(data3)
                  V1          V2          V3          V4
      V1 0.013445047 0.010478862 0.009955904 0.010529542
      V2 0.010478862 0.022910522 0.008610113 0.011046353
      V3 0.009955904 0.008610113 0.019250975 0.009464442
      V4 0.010529542 0.011046353 0.009464442 0.019260317
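
For readers who want to reproduce this kind of check without R, here is a minimal, library-free sketch of the idea: draw samples as x = B z from a root matrix B and independent standard normal z, then estimate the covariance with the unbiased (n - 1) estimator that R's cov() uses. The class name, method names and the toy root matrix are illustrative assumptions, not part of Commons Math, and the code is not what was used to produce the numbers above.

```java
import java.util.Arrays;
import java.util.Random;

final class SampleCovarianceSketch {

    /** Draws n zero-mean samples x = B z, where z has independent N(0, 1) components. */
    static double[][] sample(double[][] root, int n, long seed) {
        Random rng = new Random(seed);
        int dim = root.length;
        int rank = root[0].length;
        double[][] samples = new double[n][dim];
        for (int s = 0; s < n; s++) {
            double[] z = new double[rank];
            for (int k = 0; k < rank; k++) {
                z[k] = rng.nextGaussian();
            }
            for (int i = 0; i < dim; i++) {
                double sum = 0.0;
                for (int k = 0; k < rank; k++) {
                    sum += root[i][k] * z[k];
                }
                samples[s][i] = sum;
            }
        }
        return samples;
    }

    /** Unbiased sample covariance (divides by n - 1). */
    static double[][] covariance(double[][] samples) {
        int n = samples.length;
        int dim = samples[0].length;
        double[] mean = new double[dim];
        for (double[] x : samples) {
            for (int i = 0; i < dim; i++) {
                mean[i] += x[i] / n;
            }
        }
        double[][] cov = new double[dim][dim];
        for (double[] x : samples) {
            for (int i = 0; i < dim; i++) {
                for (int j = 0; j < dim; j++) {
                    cov[i][j] += (x[i] - mean[i]) * (x[j] - mean[j]) / (n - 1);
                }
            }
        }
        return cov;
    }

    public static void main(String[] args) {
        // Toy root with a zero first row; the target covariance is
        // B B^T = {{0, 0, 0}, {0, 4, 2}, {0, 2, 2}}.
        double[][] root = {{0.0, 0.0}, {2.0, 0.0}, {1.0, 1.0}};
        double[][] cov = covariance(sample(root, 10000, 42L));
        for (double[] row : cov) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

With a correct root matrix, a zero row in B gives a component that is constant, so its row and column in the sample covariance are exactly zero while the remaining entries land close to the requested values.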


      I've traced this back to the RectangularCholeskyDecomposition, which does not seem to handle the second matrix very well (decompositions in the same order as the matrices above):

      CorrelatedRandomVectorGenerator.getRootMatrix() =
      Array2DRowRealMatrix{
        {0.0,0.0,0.0,0.0,0.0},
        {0.0759577418122063,0.0876125188474239,0.0,0.0,0.0},
        {0.07764443622513505,0.05132821221460752,0.11976381821791235,0.0,0.0},
        {0.06662930527909404,0.05501661744114585,0.0016662506519307997,0.10749324207653632,0.0},
        {0.13822895138139477,0.0,0.0,0.0,0.0}}
      CorrelatedRandomVectorGenerator.getRank() = 5

      CorrelatedRandomVectorGenerator.getRootMatrix() =
      Array2DRowRealMatrix{
        {0.0759577418122063,0.034512751379448724,0.0},
        {0.07764443622513505,0.13029949164628746,0.0},
        {0.0,0.0,0.0},
        {0.06662930527909404,0.023203936694855674,0.0},
        {0.13822895138139477,0.0,0.0}}
      CorrelatedRandomVectorGenerator.getRank() = 3

      CorrelatedRandomVectorGenerator.getRootMatrix() =
      Array2DRowRealMatrix{
        {0.0759577418122063,0.034512751379448724,0.033913748226348225,0.07303890149947785},
        {0.07764443622513505,0.13029949164628746,0.0,0.0},
        {0.06662930527909404,0.023203936694855674,0.11851573313229945,0.0},
        {0.13822895138139477,0.0,0.0,0.0}}
      CorrelatedRandomVectorGenerator.getRank() = 4
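
A root matrix B is only usable for sampling if B Bᵀ reproduces the covariance matrix C. The sketch below multiplies two of the reported root matrices by their transposes and prints the largest absolute deviation from the corresponding input. It is plain Java with no Commons Math dependency; the class and method names are made up for illustration, and the numbers are copied from this report.

```java
final class RootMatrixCheck {

    /** Largest absolute entry of (B B^T - C). */
    static double maxError(double[][] root, double[][] cov) {
        double max = 0.0;
        for (int i = 0; i < cov.length; i++) {
            for (int j = 0; j < cov.length; j++) {
                double sum = 0.0;
                for (int k = 0; k < root[i].length; k++) {
                    sum += root[i][k] * root[j][k];
                }
                max = Math.max(max, Math.abs(sum - cov[i][j]));
            }
        }
        return max;
    }

    // Second matrix from the report (zero third row and column) and its reported root.
    static final double[][] COV2 = {
        {0.013445532, 0.01039469, 0.0, 0.009881156, 0.010499559},
        {0.01039469, 0.023006616, 0.0, 0.008196856, 0.010732709},
        {0.0, 0.0, 0.0, 0.0, 0.0},
        {0.009881156, 0.008196856, 0.0, 0.019023866, 0.009210099},
        {0.010499559, 0.010732709, 0.0, 0.009210099, 0.019107243}};
    static final double[][] ROOT2 = {
        {0.0759577418122063, 0.034512751379448724, 0.0},
        {0.07764443622513505, 0.13029949164628746, 0.0},
        {0.0, 0.0, 0.0},
        {0.06662930527909404, 0.023203936694855674, 0.0},
        {0.13822895138139477, 0.0, 0.0}};

    // Third matrix from the report (zero row and column removed) and its reported root.
    static final double[][] COV3 = {
        {0.013445532, 0.01039469, 0.009881156, 0.010499559},
        {0.01039469, 0.023006616, 0.008196856, 0.010732709},
        {0.009881156, 0.008196856, 0.019023866, 0.009210099},
        {0.010499559, 0.010732709, 0.009210099, 0.019107243}};
    static final double[][] ROOT3 = {
        {0.0759577418122063, 0.034512751379448724, 0.033913748226348225, 0.07303890149947785},
        {0.07764443622513505, 0.13029949164628746, 0.0, 0.0},
        {0.06662930527909404, 0.023203936694855674, 0.11851573313229945, 0.0},
        {0.13822895138139477, 0.0, 0.0, 0.0}};

    public static void main(String[] args) {
        System.out.println("second matrix: max |B B^T - C| = " + maxError(ROOT2, COV2));
        System.out.println("third matrix:  max |B B^T - C| = " + maxError(ROOT3, COV3));
    }
}
```

For the third matrix the deviation is negligible. For the second, the diagonal entries of B Bᵀ for V1 and V4 come out at roughly 0.0070 and 0.0050, close to the sample values in cov(data2) rather than to the requested 0.0134 and 0.0190, which suggests the sampler is faithful to the root matrix and the root matrix itself is wrong.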

      Clearly, the rank of each of these matrices should be 4. The first matrix is reported with rank 5 but does not lead to incorrect results; the second is reported with rank 3 and does. Unfortunately, I don't know enough about the Cholesky decomposition to find the flaw in the implementation, and I could not find documentation for the "rectangular" variant (nor at the links provided in the Javadoc).
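
As a point of comparison, here is a minimal sketch of a diagonally pivoted Cholesky factorisation that copes with zero rows. This is a generic textbook-style variant and is explicitly not the code in RectangularCholeskyDecomposition; the class name, method names and the threshold parameter `small` are assumptions made for illustration. At each step it picks the largest remaining diagonal entry of the residual, emits one column of B, and subtracts that column's outer product; it stops when no remaining diagonal entry exceeds `small`.

```java
import java.util.ArrayList;
import java.util.List;

final class PivotedCholeskySketch {

    /** Returns B (n x rank) with B B^T close to c, for symmetric positive semidefinite c. */
    static double[][] root(double[][] c, double small) {
        int n = c.length;
        double[][] residual = new double[n][];
        for (int i = 0; i < n; i++) {
            residual[i] = c[i].clone();
        }
        boolean[] used = new boolean[n];
        List<double[]> columns = new ArrayList<>();
        while (true) {
            // Pick the largest unused diagonal entry above the threshold.
            int p = -1;
            double best = small;
            for (int i = 0; i < n; i++) {
                if (!used[i] && residual[i][i] > best) {
                    best = residual[i][i];
                    p = i;
                }
            }
            if (p < 0) {
                break; // every remaining pivot is numerically zero
            }
            double s = Math.sqrt(best);
            double[] col = new double[n];
            for (int i = 0; i < n; i++) {
                col[i] = residual[i][p] / s;
            }
            // Remove this column's contribution from the residual.
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < n; j++) {
                    residual[i][j] -= col[i] * col[j];
                }
            }
            used[p] = true;
            columns.add(col);
        }
        double[][] b = new double[n][columns.size()];
        for (int k = 0; k < columns.size(); k++) {
            for (int i = 0; i < n; i++) {
                b[i][k] = columns.get(k)[i];
            }
        }
        return b;
    }

    /** Largest absolute entry of (B B^T - C). */
    static double maxError(double[][] b, double[][] c) {
        double max = 0.0;
        for (int i = 0; i < c.length; i++) {
            for (int j = 0; j < c.length; j++) {
                double sum = 0.0;
                for (int k = 0; k < b[i].length; k++) {
                    sum += b[i][k] * b[j][k];
                }
                max = Math.max(max, Math.abs(sum - c[i][j]));
            }
        }
        return max;
    }

    // Second matrix from the report: zero third row and column.
    static final double[][] COV2 = {
        {0.013445532, 0.01039469, 0.0, 0.009881156, 0.010499559},
        {0.01039469, 0.023006616, 0.0, 0.008196856, 0.010732709},
        {0.0, 0.0, 0.0, 0.0, 0.0},
        {0.009881156, 0.008196856, 0.0, 0.019023866, 0.009210099},
        {0.010499559, 0.010732709, 0.0, 0.009210099, 0.019107243}};

    public static void main(String[] args) {
        double[][] b = root(COV2, 1e-12);
        System.out.println("rank = " + b[0].length);
        System.out.println("max |B B^T - C| = " + maxError(b, COV2));
    }
}
```

Because the root is assembled in the original row order, no permutation bookkeeping is needed afterwards, and the zero row of the input simply becomes a zero row of B. The resulting root differs from the ones listed above, since the factorisation is not unique, but it can be checked the same way via B Bᵀ.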

          People

            Assignee: Unassigned
            Reporter: Gert van Valkenhoef (gertvv)
            Votes: 0
            Watchers: 1
