Commons Math / MATH-1110

OLSMultipleLinearRegression needs a way to specify non-zero singularity threshold when instantiating QRDecomposition

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.2
    • Fix Version/s: 3.3
    • Labels: None
    • Environment: Windows 7, jdk1.6.0_45

      Description

      OLSMultipleLinearRegression uses QRDecomposition to compute the least-squares solution. QRDecomposition can use a non-zero threshold to detect a singular design matrix (see https://issues.apache.org/jira/browse/MATH-665, https://issues.apache.org/jira/browse/MATH-1024, https://issues.apache.org/jira/browse/MATH-1100, https://issues.apache.org/jira/browse/MATH-1101), but OLSMultipleLinearRegression does not expose this capability and therefore always uses the default singularity threshold of 0. This can produce bad solutions (see in particular https://issues.apache.org/jira/browse/MATH-1101?focusedCommentId=13909750&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13909750) in cases where a SingularMatrixException should be thrown instead.
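
      For illustration, the threshold test described above can be sketched in plain Java. This is not the Commons Math source: the class name ThresholdCheckSketch, the method isSingular, and the sample diagonal values are assumptions made for this example. It only shows why a threshold of 0 lets a diagonal entry of about 1e-15 pass, while a small positive threshold flags it.

```java
final class ThresholdCheckSketch {

    // Returns true if any diagonal entry of R lies within `threshold` of zero.
    // (Hypothetical helper; it mirrors the idea of the check, not the library code.)
    static boolean isSingular(double[] rDiag, double threshold) {
        for (double d : rDiag) {
            if (Math.abs(d) <= threshold) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Made-up diagonal of R; the middle entry mimics the ~1e-15 value from this report.
        double[] rDiag = {-2.2, 1.0e-15, 0.8};

        System.out.println(isSingular(rDiag, 0.0));     // false: 1e-15 is not <= 0
        System.out.println(isSingular(rDiag, 1.0e-10)); // true: 1e-15 is <= 1e-10
    }
}
```

      With a threshold of 0, only an exactly zero diagonal entry is reported, so rounding noise is enough to hide a rank-deficient design matrix.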

      I noticed this situation because the solution values were extremely large (in the range 1e09 - 1e12), whereas normal values in the domain I am working with are on the order of 1e-3. To find out why the values were so large, I traced through the source and found that one rDiag value was on the order of 1e-15, and that this value passed the threshold test. I then noticed that two columns of the design matrix are linearly dependent: one column is all 1's because I want an intercept value in the solution, and another is also all 1's because that's how the data worked out. Thus the matrix is definitely singular.
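
      The effect of the duplicated column can be reproduced with a small, self-contained Householder QR sketch. This is an assumption-laden illustration, not the library implementation: the class DependentColumnsSketch, the method rDiag, and the 5x3 design matrix are all made up for this example. The second diagonal entry of R comes out either exactly zero or at the level of rounding error, which is the kind of value that a threshold of 0 may fail to catch.

```java
final class DependentColumnsSketch {

    // Householder QR of `a` (rows >= cols assumed); returns only the diagonal of R.
    static double[] rDiag(double[][] a) {
        int rows = a.length;
        int cols = a[0].length;

        // Work on the transpose so that each column is one contiguous array.
        double[][] qrt = new double[cols][rows];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                qrt[c][r] = a[r][c];
            }
        }

        double[] diag = new double[cols];
        for (int minor = 0; minor < cols; minor++) {
            double[] v = qrt[minor];

            // Norm of the part of the column on and below the diagonal.
            double normSq = 0.0;
            for (int r = minor; r < rows; r++) {
                normSq += v[r] * v[r];
            }
            double alpha = (v[minor] > 0) ? -Math.sqrt(normSq) : Math.sqrt(normSq);
            diag[minor] = alpha;
            if (alpha == 0.0) {
                continue; // nothing to reflect for an exactly zero column
            }

            // Build the Householder vector in place and apply it to the remaining columns.
            v[minor] -= alpha;
            for (int c = minor + 1; c < cols; c++) {
                double[] w = qrt[c];
                double dot = 0.0;
                for (int r = minor; r < rows; r++) {
                    dot += w[r] * v[r];
                }
                double scale = -dot / (alpha * v[minor]);
                for (int r = minor; r < rows; r++) {
                    w[r] -= scale * v[r];
                }
            }
        }
        return diag;
    }

    public static void main(String[] args) {
        // Column 0 is the intercept; column 1 happens to be all 1's as well.
        double[][] design = {
            {1, 1, 0.5}, {1, 1, 1.5}, {1, 1, 2.5}, {1, 1, 3.5}, {1, 1, 4.5}
        };
        double[] d = rDiag(design);
        // d[1] is zero or on the order of machine epsilon.
        System.out.println(java.util.Arrays.toString(d));
    }
}
```

      Back-substitution divides by these diagonal entries, so a value near 1e-15 that is accepted as non-zero can plausibly inflate the coefficients to the magnitudes described above.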

      If I could specify a non-zero threshold, this situation would result in a SingularMatrixException; without one, the bad solution values are propagated silently. That is a problem because this solution is intended for controlling a physical system, and damage could result from a bad solution.

      Unfortunately, I see no way to change the threshold value from outside. As a user of the package, I would in effect have to re-implement OLSMultipleLinearRegression to do this.

            People

            • Assignee: Unassigned
            • Reporter: Edward Segall (ejs)
            • Votes: 0
            • Watchers: 3
