Details
Bug
Status: Closed
Major
Resolution: Fixed
3.2
None
None
Windows 7, jdk1.6.0_45
Description
OLSMultipleLinearRegression uses QRDecomposition to compute its least-squares solution. QRDecomposition can use a non-zero threshold to detect a singular design matrix (see https://issues.apache.org/jira/browse/MATH-665, https://issues.apache.org/jira/browse/MATH-1024, https://issues.apache.org/jira/browse/MATH-1100, https://issues.apache.org/jira/browse/MATH-1101), but OLSMultipleLinearRegression does not expose this capability, so it always uses the default singularity threshold of 0. This can produce bad solutions (see in particular https://issues.apache.org/jira/browse/MATH-1101?focusedCommentId=13909750&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13909750) in cases where a SingularMatrixException should be thrown instead.
I noticed this situation because the solution values were extremely large (in the range 1e09 - 1e12), whereas normal values in the domain I am working with are on the order of 1e-3. Tracing through the source, I found an rDiag value on the order of 1e-15 that nevertheless passed the threshold test. I then noticed that two columns of the design matrix are linearly dependent: one column is all 1s because I want an intercept value in the solution, and another is also all 1s because that is how the data worked out. The matrix is therefore definitely singular.
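The effect described above can be reproduced without the library. The following is a minimal sketch, not the Commons Math code: it assumes a plain Householder QR on a small dense matrix, and the names `RDiagCheck`, `rDiag`, and `isSingular` are hypothetical. With two identical columns of 1s, the second diagonal entry of R comes out either exactly zero or as a tiny rounding residue, depending on the data; only a positive threshold reliably flags both cases.

```java
import java.util.Arrays;

final class RDiagCheck {

    /** Diagonal of R from a Householder QR of a (rows >= columns assumed). */
    static double[] rDiag(double[][] a) {
        int m = a.length;
        int n = a[0].length;
        double[][] q = new double[m][];
        for (int i = 0; i < m; i++) {
            q[i] = a[i].clone(); // work on a copy, leave the input untouched
        }
        double[] diag = new double[n];
        for (int k = 0; k < n; k++) {
            double normSq = 0.0;
            for (int i = k; i < m; i++) {
                normSq += q[i][k] * q[i][k];
            }
            // Sign chosen opposite to the pivot to avoid cancellation.
            double alpha = q[k][k] > 0 ? -Math.sqrt(normSq) : Math.sqrt(normSq);
            diag[k] = alpha;
            if (alpha != 0.0) {
                q[k][k] -= alpha; // Householder vector v = x - alpha * e1
                for (int j = k + 1; j < n; j++) {
                    double dot = 0.0;
                    for (int i = k; i < m; i++) {
                        dot += q[i][k] * q[i][j];
                    }
                    // Reflect column j: y += v * (v.y) / (alpha * v_k)
                    double f = dot / (alpha * q[k][k]);
                    for (int i = k; i < m; i++) {
                        q[i][j] += f * q[i][k];
                    }
                }
            }
        }
        return diag;
    }

    /** True if any diagonal entry has magnitude at or below the threshold. */
    static boolean isSingular(double[] diag, double threshold) {
        for (double d : diag) {
            if (Math.abs(d) <= threshold) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Intercept column, a data column that is also all 1s, and one regressor.
        double[][] design = {
            {1, 1, 0.1}, {1, 1, 0.2}, {1, 1, 0.3}
        };
        double[] diag = rDiag(design);
        System.out.println("rDiag = " + Arrays.toString(diag));
        System.out.println("singular at threshold 1e-10: " + isSingular(diag, 1e-10));
    }
}
```

The threshold of 1e-10 is an arbitrary illustrative value; a suitable one depends on the scale of the data.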
If I could specify a non-zero threshold, this situation would result in a SingularMatrixException; without one, the bad solution values are blindly propagated. That is a problem because this solution is intended for controlling a physical system, and a bad solution could cause damage.
Unfortunately, I see no way to change the threshold value from outside: as a user of the package, I would in effect have to re-implement OLSMultipleLinearRegression to do this.
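While the threshold cannot be set from outside, one possible stopgap is a caller-side sanity check on the returned coefficients before they reach the controlled system. This is only a sketch under stated assumptions: `CoefficientGuard`, `requirePlausible`, and the bound of 1.0 are hypothetical, with the bound chosen only because normal values here are around 1e-3 while the bad ones were 1e09 and above. It does not detect singularity itself, and it would miss a bad solution that happens to stay within the bound.

```java
final class CoefficientGuard {

    /**
     * Returns beta unchanged if every entry is finite and within maxAbs;
     * otherwise throws so the values are not propagated further.
     */
    static double[] requirePlausible(double[] beta, double maxAbs) {
        for (int i = 0; i < beta.length; i++) {
            double b = beta[i];
            // NaN fails every comparison, so it is tested explicitly.
            if (Double.isNaN(b) || Math.abs(b) > maxAbs) {
                throw new IllegalStateException(
                    "coefficient " + i + " = " + b + " is outside the bound " + maxAbs);
            }
        }
        return beta;
    }

    public static void main(String[] args) {
        double[] plausible = {2.0e-3, -1.5e-3};
        requirePlausible(plausible, 1.0); // passes silently

        double[] suspicious = {3.0e9, 2.0e-3};
        try {
            requirePlausible(suspicious, 1.0);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```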