Details

Type: Bug

Status: Open

Priority: Major

Resolution: Unresolved

Affects Version/s: 3.0

Fix Version/s: 4.0

Labels:
Description
This was first reported in MATH718. The result of the current implementation of the incomplete beta function I(x, a, b) is inaccurate when a and/or b are large.
I've skimmed through SLATEC, GSL, Boost, as well as NR. At first sight, no two of them use the same method to compute this function. I think TOMS708 is probably the best option.
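For reference, the function under discussion is exposed in Commons Math as Beta.regularizedBeta. A minimal sketch of how it is exercised (assuming the commons-math3 artifact on the classpath; the parameter values are arbitrary illustrations, with the second call in the large-parameter regime where the reported inaccuracy occurs):

```java
import org.apache.commons.math3.special.Beta;

public class IncompleteBetaDemo {
    public static void main(String[] args) {
        // Small parameters: the evaluation is well conditioned here.
        // Analytically, I(0.5; 2, 3) = 11/16 = 0.6875.
        System.out.println(Beta.regularizedBeta(0.5, 2.0, 3.0));

        // Large parameters: the regime this ticket is about.
        // By symmetry, I(0.5; 100, 100) = 0.5 exactly.
        System.out.println(Beta.regularizedBeta(0.5, 100.0, 100.0));
    }
}
```

Note that the absolute error at large a and b is still tiny; the complaint is about the error measured in ulps, which is the metric used throughout this ticket.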
Activity
I don't think these improvements are sufficient (the issue is really with the incomplete beta function, while I started with the beta function). The accuracy of the incomplete beta function should first be assessed (for example with the tools I have provided). In any case, this issue cannot be fully resolved before 4.0, since there is a deprecated function which cannot be removed before 4.0.
Improving the incomplete beta function is a lot of work. I have provided the references in the javadoc (mainly the NSWC FORTRAN library). As I have already indicated, I no longer have time to do it myself (I'm sorry I forgot to remove myself as an assignee for this ticket). If someone is willing to take over, I'm happy to provide some help and references, but I just can't find the time to do the job. Hope you will understand.
Could we consider that the latest improvements fixed the issue?
If further improvements are needed, a new issue could be opened.
In r1415853, committed a new implementation of Beta.logBeta(double, double). The accuracy of the new implementation is reported in the table below. This change leads to the deprecation of Beta.logBeta(double, double, double, int), as the computation is no longer iterative.
Interval | Values tested | Average error | Standard deviation | Maximum error
0 < x ≤ 8, 0 < y ≤ 8 | x[i] = i / 32, i = 1, ..., 256; y[j] = j / 32, j = 1, ..., 256 | 1.80 ulps | 81.08 ulps | 14031.0 ulps
0 < x ≤ 8, 8 < y ≤ 16 | x[i] = i / 32, i = 1, ..., 256; y[j] = j / 32, j = 257, ..., 512 | 0.50 ulps | 3.64 ulps | 694.0 ulps
0 < x ≤ 8, 16 < y ≤ 256 | x[i] = i / 32, i = 1, ..., 256; y[j] = j, j = 17, ..., 256 | 1.04 ulps | 139.32 ulps | 34509.0 ulps
8 < x ≤ 16, 8 < y ≤ 16 | x[i] = i / 32, i = 257, ..., 512; y[j] = j / 32, j = 257, ..., 512 | 0.35 ulps | 0.48 ulps | 2.0 ulps
8 < x ≤ 16, 16 < y ≤ 256 | x[i] = i / 32, i = 257, ..., 512; y[j] = j, j = 17, ..., 256 | 0.31 ulps | 0.47 ulps | 2.0 ulps
16 < x ≤ 256, 16 < y ≤ 256 | x[i] = i, i = 17, ..., 256; y[j] = j, j = 17, ..., 256 | 0.35 ulps | 0.49 ulps | 2.0 ulps
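A quick sanity check on Beta.logBeta(double, double) is to compare it against the naive evaluation through log Γ, using the identity log B(a, b) = log Γ(a) + log Γ(b) - log Γ(a + b). A sketch (assuming commons-math3 on the classpath; the values of a and b are arbitrary; the naive formula is the one that suffers from cancellation when a and b are large, which a dedicated implementation is meant to avoid):

```java
import org.apache.commons.math3.special.Beta;
import org.apache.commons.math3.special.Gamma;

public class LogBetaCheck {
    public static void main(String[] args) {
        double a = 20.0, b = 150.0;
        double direct = Beta.logBeta(a, b);
        // Naive evaluation through log Gamma: three large terms that nearly
        // cancel, so its accuracy in ulps degrades as a and b grow.
        double naive = Gamma.logGamma(a) + Gamma.logGamma(b)
                - Gamma.logGamma(a + b);
        System.out.println(direct + " vs " + naive);
    }
}
```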
In r1413369, implemented Beta.bcorr(double, double), an auxiliary function for the computation of Beta.beta(double, double).
In r1413366, implemented Gamma.logGammaMinusLogGammaSum(double, double), which maps (a, b) to log[Γ(b) / Γ(a + b)].
In r1411387, implemented Gamma.logGammaSum, which maps (a, b) to log[Γ(a + b)].
Starting with Beta.logBeta(double, double), the accuracy of the current implementation, as reported in the user guide, is:
Interval | Values tested | Average error | Standard deviation | Maximum error
0 < x ≤ 8, 0 < y ≤ 8 | x[i] = i / 32, i = 1, ..., 256; y[j] = j / 32, j = 1, ..., 256 | 5.04 ulps | 270.99 ulps | 46696.0 ulps
0 < x ≤ 8, 8 < y ≤ 16 | x[i] = i / 32, i = 1, ..., 256; y[j] = j / 32, j = 257, ..., 512 | 9.75 ulps | 149.42 ulps | 19126.0 ulps
0 < x ≤ 8, 16 < y ≤ 256 | x[i] = i / 32, i = 1, ..., 256; y[j] = j, j = 17, ..., 256 | 357.82 ulps | 39297.58 ulps | 8635522.0 ulps
8 < x ≤ 16, 8 < y ≤ 16 | x[i] = i / 32, i = 257, ..., 512; y[j] = j / 32, j = 257, ..., 512 | 2.37 ulps | 13.0 ulps | 1.9 ulps
8 < x ≤ 16, 16 < y ≤ 256 | x[i] = i / 32, i = 257, ..., 512; y[j] = j, j = 17, ..., 256 | 10.75 ulps | 9.74 ulps | 73.0 ulps
16 < x ≤ 256, 16 < y ≤ 256 | x[i] = i, i = 17, ..., 256; y[j] = j, j = 17, ..., 256 | 5.20 ulps | 4.33 ulps | 53.0 ulps
As of r1407592, programs and data files have been committed to /src/test/maxima to assess the accuracy of the current implementation. The user guide reports on this accuracy.
Hi Sebastien,
That sounds very good. I am looking forward to your code and will at least review it!
Hi Thomas,
thanks for proposing your help. Please wait a little, as I have already implemented some more methods, not yet committed.
For the time being, I'm working on a Java app to automatically assess the accuracy of implementations of special functions. I need to write a small README to explain how it works, then I'll commit the whole lot. I have also reimplemented logBeta. I will provide all of this soon, but I am swamped with work at the moment.
I'll try to commit next weekend. Thanks for your patience!
BTW: be careful with the TOMS code, which is copyrighted. I based my implementations on the NSWC library, which is very (very) similar (it is by the same author) but is not copyrighted.
Could you provide the scripts/programs you used for testing as a patch?
I may be able to help with the implementation of the TOMS708 algorithm.
Here are the results of some error tests I've carried out, based on reference values computed with Maxima to 64 decimal digits. The statistics correspond to errors in ulps.
(In every case the geometric mean is 0.0 and the sum of logs is reported as infinite, since at least one error is exactly 0 ulp.)

a | b | n | min | max | mean | standard deviation
1.0 | 1.0 | 129 | 0.0 | 9.0 | 3.95 | 2.39
1.0 | 10.0 | 129 | 0.0 | 55.0 | 7.09 | 14.74
1.0 | 100.0 | 129 | 0.0 | 134.0 | 2.14 | 14.26
10.0 | 1.0 | 129 | 0.0 | 233.0 | 40.00 | 30.38
10.0 | 10.0 | 129 | 0.0 | 116.0 | 29.13 | 27.00
10.0 | 100.0 | 129 | 0.0 | 105.0 | 7.89 | 22.25
100.0 | 1.0 | 129 | 0.0 | 474.0 | 162.31 | 92.26
100.0 | 10.0 | 129 | 0.0 | 364.0 | 113.30 | 79.07
100.0 | 100.0 | 129 | 0.0 | 1447.0 | 465.00 | 470.74
The situation is not too bad, but the accuracy figures claimed by Boost are much better, so it might be worth having a look at the implementation proposed in TOMS708.
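The ulp-based error measurement used above can be sketched in plain Java, with no Commons Math dependency (the reference value below is a hypothetical stand-in for a high-precision value exported from Maxima):

```java
public class UlpErrorDemo {
    // Distance between a computed value and a reference value, measured in
    // ulps of the reference. This is the metric reported in the statistics
    // above; the actual test harness compares against Maxima output.
    static double errorInUlps(double computed, double reference) {
        return Math.abs(computed - reference) / Math.ulp(reference);
    }

    public static void main(String[] args) {
        // Hypothetical data: a computed value off by exactly two ulps.
        double reference = 0.123456789;
        double computed = reference + 2 * Math.ulp(reference);
        System.out.println(errorInUlps(computed, reference)); // prints 2.0
    }
}
```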
The deprecated method has been removed for 4.0.