Commons Math / MATH-85

[math] SimpleRegression getSumSquaredErrors

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.1
    • Fix Version/s: None
    • Labels:
      None
    • Environment:

      Operating System: Windows 2000
      Platform: PC

      Description

      getSumSquaredErrors returns a negative value. See the test below:

      public void testSimpleRegression() {
          double[] y = { 8915.102, 8919.302, 8923.502 };
          double[] x = { 1.107178495, 1.107264895, 1.107351295 };
          double[] x2 = { 1.107178495E2, 1.107264895E2, 1.107351295E2 };
          SimpleRegression reg = new SimpleRegression();
          for (int i = 0; i < x.length; i++) {
              reg.addData(x[i], y[i]);
          }
          assertTrue(reg.getSumSquaredErrors() >= 0.0); // OK
          reg.clear();
          for (int i = 0; i < x.length; i++) {
              reg.addData(x2[i], y[i]);
          }
          assertTrue(reg.getSumSquaredErrors() >= 0.0); // FAIL
      }

      1. math-85.patch
        1 kB
        Luc Maisonobe

        Activity

        Luc Maisonobe added a comment -

        The problem is related to computation accuracy in a corner case.

        The data (110.7178495, 8915.102), (110.7264895, 8919.302), (110.7351295, 8923.502) represent three points on a perfect straight line, with the second point exactly midway between the two extreme points. In this case, the sum of squared errors should be exactly 0, since all points lie exactly on the estimated line.

        If, instead of checking reg.getSumSquaredErrors() >= 0.0, I print the value, I get -7.105427357601002E-15 on my GNU/Linux box. This seems quite reasonable to me, as the computation involves a subtraction close to 35.28 - 35.28, where both terms result from several earlier computations. An error of this size is consistent with double precision.

        What we observe here is simply a cancellation effect in the subtraction. The result is zero in the first part of the test (where the x values are 100 times smaller) and slightly negative in the second part. I think the zero result in the first part is only good fortune (well, it is really related to the orders of magnitude involved: x^2, y^2 and xy).

        I suggest we do not consider this a bug.
        I will add a patch with a slightly modified test case in a few minutes.
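
To make the cancellation concrete, here is a minimal standalone sketch. It assumes the sum of squared errors is computed as Syy - Sxy^2 / Sxx from incrementally updated centered sums; that formula is an assumption chosen for illustration, not a copy of the SimpleRegression source, and the class and method names are hypothetical. For the data above, both terms of the final subtraction are close to 35.28, so the printed result is only rounding noise and its sign may differ between platforms.

```java
// Sketch only: assumes SSE = Syy - Sxy^2 / Sxx with incrementally updated
// centered sums. CancellationSketch and rawSumSquaredErrors are hypothetical
// names, not part of Commons Math.
final class CancellationSketch {

    /** Raw (unclamped) sum of squared errors for the points (x[i], y[i]). */
    static double rawSumSquaredErrors(double[] x, double[] y) {
        double xbar = 0.0;
        double ybar = 0.0;
        double sxx = 0.0;
        double syy = 0.0;
        double sxy = 0.0;
        for (int n = 0; n < x.length; n++) {
            if (n == 0) {
                // The first point initializes the running means.
                xbar = x[0];
                ybar = y[0];
            } else {
                // Update the centered sums and the running means.
                double dx = x[n] - xbar;
                double dy = y[n] - ybar;
                double w = n / (n + 1.0);
                sxx += dx * dx * w;
                syy += dy * dy * w;
                sxy += dx * dy * w;
                xbar += dx / (n + 1.0);
                ybar += dy / (n + 1.0);
            }
        }
        // Two nearly equal quantities are subtracted here, so the
        // result is dominated by rounding error when the fit is exact.
        return syy - sxy * sxy / sxx;
    }

    public static void main(String[] args) {
        double[] y = { 8915.102, 8919.302, 8923.502 };
        double[] x2 = { 110.7178495, 110.7264895, 110.7351295 };
        System.out.println("raw SSE = " + rawSumSquaredErrors(x2, y));
    }
}
```

The exact value printed is not guaranteed; the point is only that it is tiny compared with 35.28 and is not constrained to be non-negative.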

        Luc Maisonobe added a comment -

        Patch adding a test case for issue MATH-85.

        Phil Steitz added a comment -

        I agree this is a corner case and the negative result is due to rounding. The question is, should we force the result to 0 when a negative value is returned by the computation?

        Phil Steitz added a comment -

        Constrained returned result to be non-negative.
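
One way such a constraint could look is sketched below. This is an illustration under assumptions, not the committed change: the helper name clampToZero and the class NonNegativeClamp are hypothetical, and the sketch simply applies Math.max to an already computed value.

```java
// Sketch only: a hypothetical helper that forces a computed sum of squared
// errors to be non-negative. Not taken from the Commons Math source.
final class NonNegativeClamp {

    /** Returns value if it is positive, 0.0 if it is negative or zero. */
    static double clampToZero(double value) {
        // Math.max propagates NaN, so an undefined result stays NaN
        // instead of being silently turned into 0.0.
        return Math.max(0.0, value);
    }

    public static void main(String[] args) {
        // The slightly negative value reported in this issue.
        System.out.println(clampToZero(-7.105427357601002E-15));
        // A genuine positive sum of squared errors passes through unchanged.
        System.out.println(clampToZero(2.5));
    }
}
```

Clamping hides the rounding noise from callers without changing any result that was already non-negative.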


          People

          • Assignee: Unassigned
          • Reporter: Mark Osborn
          • Votes: 0
          • Watchers: 0
