Details
- Bug
- Status: Closed
- Major
- Resolution: Fixed
- 3.0
- None
- None
- JDK 1.6 / Eclipse Indigo on Ubuntu 10.04
Description
The following three matrices produce different results when sampling a multivariate Gaussian with CorrelatedRandomVectorGenerator. The first two are permutations of each other; the third is the same matrix with the all-zero row and column removed. The sample covariances were calculated in R from 10,000 samples:
Array2DRowRealMatrix
{ {0.0,0.0,0.0,0.0,0.0},
{0.0,0.013445532,0.01039469,0.009881156,0.010499559},
{0.0,0.01039469,0.023006616,0.008196856,0.010732709},
{0.0,0.009881156,0.008196856,0.019023866,0.009210099},
{0.0,0.010499559,0.010732709,0.009210099,0.019107243}}
> cov(data1)
V1 V2 V3 V4 V5
V1 0 0.000000000 0.00000000 0.000000000 0.000000000
V2 0 0.013383931 0.01034401 0.009913271 0.010506733
V3 0 0.010344006 0.02309479 0.008374730 0.010759306
V4 0 0.009913271 0.00837473 0.019005488 0.009187287
V5 0 0.010506733 0.01075931 0.009187287 0.019021483
Array2DRowRealMatrix
{ {0.013445532,0.01039469,0.0,0.009881156,0.010499559},
{0.01039469,0.023006616,0.0,0.008196856,0.010732709},
{0.0,0.0,0.0,0.0,0.0},
{0.009881156,0.008196856,0.0,0.019023866,0.009210099},
{0.010499559,0.010732709,0.0,0.009210099,0.019107243}}
> cov(data2)
V1 V2 V3 V4 V5
V1 0.006922905 0.010507692 0 0.005817399 0.010330529
V2 0.010507692 0.023428918 0 0.008273152 0.010735568
V3 0.000000000 0.000000000 0 0.000000000 0.000000000
V4 0.005817399 0.008273152 0 0.004929843 0.009048759
V5 0.010330529 0.010735568 0 0.009048759 0.018683544
Array2DRowRealMatrix
{ {0.013445532,0.01039469,0.009881156,0.010499559},
{0.01039469,0.023006616,0.008196856,0.010732709},
{0.009881156,0.008196856,0.019023866,0.009210099},
{0.010499559,0.010732709,0.009210099,0.019107243}}
> cov(data3)
V1 V2 V3 V4
V1 0.013445047 0.010478862 0.009955904 0.010529542
V2 0.010478862 0.022910522 0.008610113 0.011046353
V3 0.009955904 0.008610113 0.019250975 0.009464442
V4 0.010529542 0.011046353 0.009464442 0.019260317
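To make the discrepancy in the second case explicit, the R output can be compared entry by entry with the input matrix. The following is a minimal sketch in plain Java (no Commons Math dependency); the class name, the method name, and the 10% threshold are assumptions made for illustration, with the threshold chosen to sit well above the sampling noise expected from 10,000 samples.

```java
import java.util.ArrayList;
import java.util.List;

class CovarianceComparison {
    // Second input matrix from the report (all-zero row/column at index 2).
    static final double[][] TARGET = {
        {0.013445532, 0.01039469,  0.0, 0.009881156, 0.010499559},
        {0.01039469,  0.023006616, 0.0, 0.008196856, 0.010732709},
        {0.0,         0.0,         0.0, 0.0,         0.0},
        {0.009881156, 0.008196856, 0.0, 0.019023866, 0.009210099},
        {0.010499559, 0.010732709, 0.0, 0.009210099, 0.019107243}};

    // cov(data2) as printed by R in the report.
    static final double[][] SAMPLE = {
        {0.006922905, 0.010507692, 0.0, 0.005817399, 0.010330529},
        {0.010507692, 0.023428918, 0.0, 0.008273152, 0.010735568},
        {0.0,         0.0,         0.0, 0.0,         0.0},
        {0.005817399, 0.008273152, 0.0, 0.004929843, 0.009048759},
        {0.010330529, 0.010735568, 0.0, 0.009048759, 0.018683544}};

    /** Returns the upper-triangle index pairs whose relative error exceeds the threshold. */
    static List<int[]> deviatingEntries(double[][] target, double[][] sample, double threshold) {
        List<int[]> result = new ArrayList<>();
        for (int i = 0; i < target.length; i++) {
            for (int j = i; j < target.length; j++) {
                double t = target[i][j];
                double diff = Math.abs(sample[i][j] - t);
                // For a zero target, any nonzero sample value counts as a deviation.
                boolean off = (t == 0.0) ? diff > 0.0 : diff / Math.abs(t) > threshold;
                if (off) {
                    result.add(new int[] {i, j});
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (int[] e : deviatingEntries(TARGET, SAMPLE, 0.10)) {
            System.out.printf("V%d/V%d: target %.9f, sample %.9f%n",
                    e[0] + 1, e[1] + 1, TARGET[e[0]][e[1]], SAMPLE[e[0]][e[1]]);
        }
    }
}
```

With the numbers above, only the V1 and V4 variances and the V1/V4 covariance exceed the threshold; every other entry of cov(data2) is within a few percent of its target.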
I've traced this back to the RectangularCholeskyDecomposition, which does not seem to handle the second matrix very well (decompositions in the same order as the matrices above):
CorrelatedRandomVectorGenerator.getRootMatrix() =
Array2DRowRealMatrix
{ {0.0,0.0,0.0,0.0,0.0},
{0.0759577418122063,0.0876125188474239,0.0,0.0,0.0},
{0.07764443622513505,0.05132821221460752,0.11976381821791235,0.0,0.0},
{0.06662930527909404,0.05501661744114585,0.0016662506519307997,0.10749324207653632,0.0},
{0.13822895138139477,0.0,0.0,0.0,0.0}}
CorrelatedRandomVectorGenerator.getRank() = 5
CorrelatedRandomVectorGenerator.getRootMatrix() =
Array2DRowRealMatrix{{0.0759577418122063,0.034512751379448724,0.0},
,
{0.0,0.0,0.0},
{0.06662930527909404,0.023203936694855674,0.0},
{0.13822895138139477,0.0,0.0}}
CorrelatedRandomVectorGenerator.getRank() = 3
CorrelatedRandomVectorGenerator.getRootMatrix() =
Array2DRowRealMatrix{{0.0759577418122063,0.034512751379448724,0.033913748226348225,0.07303890149947785},
,
{0.06662930527909404,0.023203936694855674,0.11851573313229945,0.0},
{0.13822895138139477,0.0,0.0,0.0}}
CorrelatedRandomVectorGenerator.getRank() = 4
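A root matrix B is only usable for sampling if B·Bᵀ reproduces the covariance matrix it was computed from. For the first case, where all five rows of the root are shown above, this can be checked directly. The sketch below uses plain Java arrays rather than the Commons Math matrix classes; the class and method names are hypothetical.

```java
class RootCheck {
    // First input matrix from the report (all-zero row/column at index 0).
    static final double[][] COV = {
        {0.0, 0.0,         0.0,         0.0,         0.0},
        {0.0, 0.013445532, 0.01039469,  0.009881156, 0.010499559},
        {0.0, 0.01039469,  0.023006616, 0.008196856, 0.010732709},
        {0.0, 0.009881156, 0.008196856, 0.019023866, 0.009210099},
        {0.0, 0.010499559, 0.010732709, 0.009210099, 0.019107243}};

    // Root matrix reported by getRootMatrix() for that input.
    static final double[][] ROOT = {
        {0.0, 0.0, 0.0, 0.0, 0.0},
        {0.0759577418122063, 0.0876125188474239, 0.0, 0.0, 0.0},
        {0.07764443622513505, 0.05132821221460752, 0.11976381821791235, 0.0, 0.0},
        {0.06662930527909404, 0.05501661744114585, 0.0016662506519307997, 0.10749324207653632, 0.0},
        {0.13822895138139477, 0.0, 0.0, 0.0, 0.0}};

    /** Computes B * B^T for a rectangular or square B. */
    static double[][] timesTranspose(double[][] b) {
        int n = b.length;
        double[][] p = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                for (int k = 0; k < b[i].length; k++) {
                    p[i][j] += b[i][k] * b[j][k];
                }
            }
        }
        return p;
    }

    /** Largest absolute entry-wise difference between two equally sized matrices. */
    static double maxAbsDiff(double[][] x, double[][] y) {
        double max = 0.0;
        for (int i = 0; i < x.length; i++) {
            for (int j = 0; j < x[i].length; j++) {
                max = Math.max(max, Math.abs(x[i][j] - y[i][j]));
            }
        }
        return max;
    }

    /** Counts the columns of B that contain only zeros. */
    static int zeroColumns(double[][] b) {
        int count = 0;
        for (int k = 0; k < b[0].length; k++) {
            boolean allZero = true;
            for (double[] row : b) {
                allZero &= row[k] == 0.0;
            }
            if (allZero) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("max |B*B^T - C| = " + maxAbsDiff(timesTranspose(ROOT), COV));
        System.out.println("all-zero columns in B = " + zeroColumns(ROOT));
    }
}
```

For the first case the product agrees with the input matrix to within rounding, which matches the correct sampling results. The last column of this root is entirely zero, so only four columns actually contribute even though getRank() reports 5.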
Clearly, the rank of each of these matrices should be 4, yet the reported ranks are 5, 3, and 4. The first matrix does not lead to incorrect sampling results, but the second one does. Unfortunately, I don't know enough about the Cholesky decomposition to find the flaw in the implementation, and I could not find documentation for the "rectangular" variant, not even at the links provided in the Javadoc.
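For reference, here is a minimal sketch of a Cholesky decomposition with diagonal pivoting for positive semidefinite matrices, written in plain Java. It is not the RectangularCholeskyDecomposition code from Commons Math and makes no claim about where that implementation goes wrong; the class name, the helper names, and the absolute tolerance of 1e-10 are assumptions for illustration. At each step it picks the largest remaining diagonal entry of the residual, emits one column of the root, and subtracts the corresponding rank-one term, stopping once no diagonal entry exceeds the tolerance.

```java
class PivotedCholeskySketch {
    /** Holds the factor (first 'rank' columns of 'root') and the detected rank. */
    static final class Result {
        final double[][] root;
        final int rank;
        Result(double[][] root, int rank) { this.root = root; this.rank = rank; }
    }

    static Result decompose(double[][] c, double tol) {
        int n = c.length;
        double[][] residual = new double[n][];
        for (int i = 0; i < n; i++) {
            residual[i] = c[i].clone();
        }
        double[][] root = new double[n][n];
        boolean[] used = new boolean[n];
        int rank = 0;
        for (int step = 0; step < n; step++) {
            // Pivot: largest diagonal entry among the indices not yet eliminated.
            int p = -1;
            for (int i = 0; i < n; i++) {
                if (!used[i] && (p < 0 || residual[i][i] > residual[p][p])) {
                    p = i;
                }
            }
            if (residual[p][p] <= tol) {
                break; // remaining part is numerically zero
            }
            double s = Math.sqrt(residual[p][p]);
            for (int i = 0; i < n; i++) {
                root[i][rank] = residual[i][p] / s;
            }
            // Subtract the rank-one contribution of the new column.
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < n; j++) {
                    residual[i][j] -= root[i][rank] * root[j][rank];
                }
            }
            used[p] = true;
            rank++;
        }
        return new Result(root, rank);
    }

    /** Largest absolute entry of C - root * root^T. */
    static double reconstructionError(double[][] c, double[][] root) {
        double max = 0.0;
        for (int i = 0; i < c.length; i++) {
            for (int j = 0; j < c.length; j++) {
                double sum = 0.0;
                for (int k = 0; k < root[i].length; k++) {
                    sum += root[i][k] * root[j][k];
                }
                max = Math.max(max, Math.abs(c[i][j] - sum));
            }
        }
        return max;
    }

    // The 4x4 matrix from the report (third case).
    static final double[][] BASE = {
        {0.013445532, 0.01039469,  0.009881156, 0.010499559},
        {0.01039469,  0.023006616, 0.008196856, 0.010732709},
        {0.009881156, 0.008196856, 0.019023866, 0.009210099},
        {0.010499559, 0.010732709, 0.009210099, 0.019107243}};

    /** Inserts an all-zero row and column at index k (k = 0 and k = 2 give the first two cases). */
    static double[][] withZeroAt(double[][] base, int k) {
        int n = base.length + 1;
        double[][] m = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                if (i != k && j != k) {
                    m[i][j] = base[i < k ? i : i - 1][j < k ? j : j - 1];
                }
            }
        }
        return m;
    }

    public static void main(String[] args) {
        double[][][] cases = {withZeroAt(BASE, 0), withZeroAt(BASE, 2), BASE};
        for (double[][] c : cases) {
            Result r = decompose(c, 1e-10);
            System.out.println("rank = " + r.rank
                    + ", max |C - B*B^T| = " + reconstructionError(c, r.root));
        }
    }
}
```

Because the pivot is chosen by index and no rows are physically swapped, the all-zero row is simply never selected, and the position of that row in the input does not affect the detected rank.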