# [math] SimpleRegression getSumSquaredErrors

## Details

• Type: Bug
• Status: Closed
• Priority: Major
• Resolution: Fixed
• Affects Version/s: 1.1
• Fix Version/s: None
• Labels: None
• Environment: Operating System: Windows 2000; Platform: PC

## Description

getSumSquaredErrors returns a negative value. See the test below:

```java
public void testSimpleRegression() {
    double[] y  = { 8915.102, 8919.302, 8923.502 };
    double[] x  = { 1.107178495, 1.107264895, 1.107351295 };
    double[] x2 = { 1.107178495E2, 1.107264895E2, 1.107351295E2 };
    SimpleRegression reg = new SimpleRegression();
    for (int i = 0; i < x.length; i++) {
        reg.addData(x[i], y[i]);
    }
    assertTrue(reg.getSumSquaredErrors() >= 0.0); // OK
    reg.clear();
    for (int i = 0; i < x2.length; i++) {
        reg.addData(x2[i], y[i]);
    }
    assertTrue(reg.getSumSquaredErrors() >= 0.0); // FAIL
}
```

## Attachments

1. math-85.patch (1 kB, Luc Maisonobe)

## Activity

Luc Maisonobe added a comment -

The problem is related to computation accuracy in a corner case.

The data (110.7178495, 8915.102), (110.7264895, 8919.302), (110.7351295, 8923.502) represent three points on a perfect straight line, with the second point exactly midway between the two extreme points. In this case the sum of the squares of the errors should be exactly 0, as all points lie exactly on the estimated line.

If instead of checking reg.getSumSquaredErrors() >= 0.0 I print the value, I get -7.105427357601002E-15 on my GNU/Linux box. This seems quite fair to me, as the computation involves a subtraction close to 35.28 - 35.28, where both terms result from several earlier computations. This is consistent with double precision.

What we observe here is simply a cancellation effect in the subtraction. The result is zero in the first part of the test (where the x values are 100 times smaller) and slightly negative in the second part. I think the zero result in the first part is only good fortune (well, it is really related to the orders of magnitude involved: x^2, y^2 and xy).

I suggest we consider that this is not a bug.
I will add a patch with a slightly modified test case in a few minutes.

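The cancellation Luc describes can be reproduced outside of SimpleRegression with a minimal sketch (this is not the Commons Math source): computing the sum of squared errors in closed form as Syy - Sxy^2/Sxx subtracts two terms that are both approximately 35.28 for this data, so the result can come out as a tiny nonzero value, possibly negative, even though the true value is exactly 0.

```java
public class CancellationDemo {

    /** Closed-form SSE from centered sums; prone to cancellation
     *  when the fit is (near-)exact. */
    static double sse(double[] x, double[] y) {
        int n = x.length;
        double sumX = 0.0, sumY = 0.0;
        for (int i = 0; i < n; i++) {
            sumX += x[i];
            sumY += y[i];
        }
        double xBar = sumX / n, yBar = sumY / n;
        double sxx = 0.0, sxy = 0.0, syy = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - xBar, dy = y[i] - yBar;
            sxx += dx * dx;
            sxy += dx * dy;
            syy += dy * dy;
        }
        // For the data below, both terms are close to 35.28.
        return syy - sxy * sxy / sxx;
    }

    public static void main(String[] args) {
        double[] x = { 110.7178495, 110.7264895, 110.7351295 };
        double[] y = { 8915.102, 8919.302, 8923.502 };
        // True SSE is exactly 0; the double result is merely tiny,
        // and its sign is rounding noise.
        System.out.println("SSE = " + sse(x, y));
    }
}
```

The exact value printed depends on the platform and on the order of operations; the point is only that its magnitude is on the order of the unit roundoff of the ~35.28 intermediates, i.e. around 1e-14.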
Luc Maisonobe added a comment -

patch adding a test case for issue MATH-85

Phil Steitz added a comment -

I agree this is a corner case and the negative result is due to rounding. The question is, should we force the result to 0 when a negative value is returned by the computation?

Phil Steitz added a comment -

Constrained returned result to be non-negative.

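The fix Phil describes amounts to clamping the closed-form result at zero. A minimal sketch of that approach, assuming a helper shaped like this (not the actual Commons Math code):

```java
public class NonNegativeSse {

    /** Returns max(0, Syy - Sxy^2 / Sxx), so rounding error in the
     *  near-exact-fit case can never surface as a negative SSE. */
    static double sumSquaredErrors(double sxx, double sxy, double syy) {
        return Math.max(0.0, syy - sxy * sxy / sxx);
    }

    public static void main(String[] args) {
        // A slightly negative raw value, as in the report, clamps to 0.
        double raw = -7.105427357601002E-15;
        System.out.println(Math.max(0.0, raw)); // prints 0.0
    }
}
```

Clamping keeps the documented invariant (SSE >= 0) without changing any result by more than the rounding noise already present in the computation.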

## People

• Assignee: Unassigned
• Reporter: Mark Osborn