Details

Type: Bug

Status: Closed

Priority: Major

Resolution: Fixed

Affects Version/s: 1.1

Fix Version/s: None

Labels: None

Environment:
Operating System: Windows 2000
Platform: PC
Description
getSumSquaredErrors returns a negative value. See the test below:
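public void testSimpleRegression() {
    // y and x2 are the three points quoted in the comment below
    double[] y = { 8915.102, 8919.302, 8923.502 };
    // same points with x 100 times smaller
    double[] x = { 1.107178495, 1.107264895, 1.107351295 };
    double[] x2 = { 110.7178495, 110.7264895, 110.7351295 };
    SimpleRegression reg = new SimpleRegression();
    for (int i = 0; i < x.length; i++) {
        reg.addData(x[i], y[i]);
    }
    assertTrue(reg.getSumSquaredErrors() >= 0.0); // OK
    reg.clear();
    for (int i = 0; i < x.length; i++) {
        reg.addData(x2[i], y[i]);
    }
    assertTrue(reg.getSumSquaredErrors() >= 0.0); // FAIL
}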
The problem is related to computation accuracy in a corner case.
The data (110.7178495, 8915.102), (110.7264895, 8919.302), (110.7351295, 8923.502) represent three points on a perfectly straight line, with the second point exactly midway between the two extreme points. In this case, the sum of the squares of the errors should be exactly 0, as all points lie exactly on the estimated line.
If, instead of checking reg.getSumSquaredErrors() >= 0.0, I print the value, I get -7.105427357601002E-15 on my GNU/Linux box. This seems reasonable to me, since the computation involves a subtraction close to 35.28 - 35.28, where both terms are themselves the results of several earlier computations. An error of this size is consistent with double precision.
What we observe here is simply a cancellation effect in the subtraction. The result is exactly zero in the first part of the test (where the x values are 100 times smaller) and slightly negative in the second part. I think the zero result in the first part is only good fortune (more precisely, it depends on the orders of magnitude involved: x^2, y^2 and xy).
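To illustrate the cancellation, here is a minimal standalone sketch. It assumes the residual sum of squares is evaluated as Syy - Sxy^2 / Sxx; that formula is my assumption about the shape of the computation, and the class and method names are hypothetical, not taken from the SimpleRegression source. For the three y values the centered sum of squares is 2 * 4.2^2 = 35.28, so both terms of the subtraction are close to 35.28 and only rounding noise is left. The sketch also shows one possible guard, clamping the result at zero.

```java
// Hypothetical helper, NOT part of Commons Math: residual sum of squares
// of a simple linear fit, computed as SSE = Syy - Sxy^2 / Sxx (assumed formula).
final class SseCancellationDemo {

    /** Raw residual sum of squares; rounding may make it slightly negative. */
    static double rawSse(double[] x, double[] y) {
        int n = x.length;
        double xbar = 0.0;
        double ybar = 0.0;
        for (int i = 0; i < n; i++) {
            xbar += x[i] / n;
            ybar += y[i] / n;
        }
        double sxx = 0.0;
        double syy = 0.0;
        double sxy = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - xbar;
            double dy = y[i] - ybar;
            sxx += dx * dx;
            syy += dy * dy;
            sxy += dx * dy;
        }
        // Two nearly equal terms (about 35.28 each for the data below)
        // are subtracted here, so only rounding residue remains.
        return syy - sxy * sxy / sxx;
    }

    /** Same quantity, clamped so that rounding noise never yields a negative value. */
    static double clampedSse(double[] x, double[] y) {
        return Math.max(0.0, rawSse(x, y));
    }

    public static void main(String[] args) {
        double[] y = { 8915.102, 8919.302, 8923.502 };
        double[] x = { 110.7178495, 110.7264895, 110.7351295 };
        System.out.println("raw     = " + rawSse(x, y));
        System.out.println("clamped = " + clampedSse(x, y));
    }
}
```

The sign and exact size of the raw value depend on the order of the floating-point operations, so this sketch may not reproduce the figure quoted above; the point is only that the residue is tiny compared with 35.28 and that clamping removes the negative case.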
I suggest we consider that this is not a bug.
I will add a patch with a slightly modified test case in a few minutes.