Commons Math
  1. Commons Math
  2. MATH-718

inverseCumulativeProbability of BinomialDistribution returns wrong value for large trials.

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 2.2, 3.0
    • Fix Version/s: 3.1
    • Labels:
      None

      Description

      The inverseCumulativeProbability method of the BinomialDistributionImpl class returns wrong value for large trials. Following code will be reproduce the problem.

      System.out.println(new BinomialDistributionImpl(1000000, 0.5).inverseCumulativeProbability(0.5));

      This returns 499525, though it should be 499999.

      I'm not sure how it should be fixed, but the cause is that the cumulativeProbability method returns Infinity, not NaN. As the result the checkedCumulativeProbability method doesn't work as expected.

      1. MATH-718.diff
        4 kB
        Thomas Neidhart

        Issue Links

          Activity

          Hide
          Sébastien Brisard added a comment - - edited

          Hi Yuji,
          thanks for reporting this. For version 3.0, we are currently reshaping the package distribution, and this will probably get resolved once we are over with MATH-692.
          Best regards,
          Sébastien

          Show
          Sébastien Brisard added a comment - - edited Hi Yuji, thanks for reporting this. For version 3.0, we are currently reshaping the package distribution, and this will probably get resolved once we are over with MATH-692 . Best regards, Sébastien
          Hide
          Christian Winter added a comment -

          There seem to be stability problems with Beta.regularizedBeta(...) when using extreme parameters. PascalDistribution.cumulativeProbability(Integer.MAX_VALUE) returns NaN instead of 1. We should look for a way to avoid both infinite values and NaNs in the implementation of the regularized beta function.

          Show
          Christian Winter added a comment - There seem to be stability problems with Beta.regularizedBeta(...) when using extreme parameters. PascalDistribution.cumulativeProbability(Integer.MAX_VALUE) returns NaN instead of 1. We should look for a way to avoid both infinite values and NaNs in the implementation of the regularized beta function.
          Hide
          Sébastien Brisard added a comment -

          As rightly pointed out by Christian, this issue is strongly related to MATH-738.

          Show
          Sébastien Brisard added a comment - As rightly pointed out by Christian, this issue is strongly related to MATH-738 .
          Hide
          Sébastien Brisard added a comment -

          Sorry, I only meant to postpone this issue.

          Show
          Sébastien Brisard added a comment - Sorry, I only meant to postpone this issue.
          Hide
          James Kaufman added a comment -

          I just wanted to let you know that our open source project (http://www.eclipse.org/stem/) needs this function and we are eagerly awaiting the update. We are experiencing the issue of wrong values for large trials.

          Show
          James Kaufman added a comment - I just wanted to let you know that our open source project ( http://www.eclipse.org/stem/ ) needs this function and we are eagerly awaiting the update. We are experiencing the issue of wrong values for large trials.
          Hide
          Sébastien Brisard added a comment -

          Hi James,
          thanks for your interest. STEM is a very interesting project!
          I will try and find a fix for this issue as soon as possible. Any ideas are welcome!
          Sébastien

          Show
          Sébastien Brisard added a comment - Hi James, thanks for your interest. STEM is a very interesting project! I will try and find a fix for this issue as soon as possible. Any ideas are welcome! Sébastien
          Hide
          Thomas Neidhart added a comment - - edited

          The problem Christian described wrt the PascalDistribution is a simple integer overflow in the class itself:

              public double cumulativeProbability(int x) {
                  double ret;
                  if (x < 0) {
                      ret = 0.0;
                  } else {
                      ret = Beta.regularizedBeta(probabilityOfSuccess,
                              numberOfSuccesses, x + 1);
                  }
                  return ret;
              }
          

          when x = Integer.MAX_VALUE, adding 1 to it will result in an overflow. As the parameter of regularizedBeta is anyway a double, so it should be changed to something like "1L + x" to enforce a long addition.

          Edit: Similar things happen btw also in other Distribution implementations, so it should be fixed also there, e.g. BinomialDistribution

          Show
          Thomas Neidhart added a comment - - edited The problem Christian described wrt the PascalDistribution is a simple integer overflow in the class itself: public double cumulativeProbability(int x) { double ret; if (x < 0) { ret = 0.0; } else { ret = Beta.regularizedBeta(probabilityOfSuccess, numberOfSuccesses, x + 1); } return ret; } when x = Integer.MAX_VALUE, adding 1 to it will result in an overflow. As the parameter of regularizedBeta is anyway a double, so it should be changed to something like "1L + x" to enforce a long addition. Edit: Similar things happen btw also in other Distribution implementations, so it should be fixed also there, e.g. BinomialDistribution
          Hide
          Thomas Neidhart added a comment -

          The problem is not only related to the Beta function, also the ContinuedFraction.evaluate is numerically unstable.

          The reason the cumulativeProbability returns infinity instead of NaN is because the evaluate return 0.0 when called from Beta.regularizedBeta, which leads to a division by zero. The used default epsilon of 10e-15 seems also quite restrictive, when relaxing the epsilon I got much better results (e.g. with 10e-5 I got a result of 499997).

          Show
          Thomas Neidhart added a comment - The problem is not only related to the Beta function, also the ContinuedFraction.evaluate is numerically unstable. The reason the cumulativeProbability returns infinity instead of NaN is because the evaluate return 0.0 when called from Beta.regularizedBeta, which leads to a division by zero. The used default epsilon of 10e-15 seems also quite restrictive, when relaxing the epsilon I got much better results (e.g. with 10e-5 I got a result of 499997).
          Hide
          Thomas Neidhart added a comment -

          I further looked into this with relation to MATH-785. First of all, in the original bug report, the reporter mentions that the expected result should be 499999 which is wrong, imho it should be 500000.

          After implementing the modified Lentz-Thompson algorithm, the results for the BinomialDistribution of large trials show correct results.

          Show
          Thomas Neidhart added a comment - I further looked into this with relation to MATH-785 . First of all, in the original bug report, the reporter mentions that the expected result should be 499999 which is wrong, imho it should be 500000. After implementing the modified Lentz-Thompson algorithm, the results for the BinomialDistribution of large trials show correct results.
          Hide
          Thomas Neidhart added a comment - - edited

          The attached diff file shows the (preliminary) implementation of the modified Lentz-Thompson algorithm.

          Edit: re-uploaded the diff file as it was broken.

          Edit2: the failing unit tests I mentioned before were due to a wrong loop, the latest diff shows no unit test errors.

          Show
          Thomas Neidhart added a comment - - edited The attached diff file shows the (preliminary) implementation of the modified Lentz-Thompson algorithm. Edit: re-uploaded the diff file as it was broken. Edit2: the failing unit tests I mentioned before were due to a wrong loop, the latest diff shows no unit test errors.
          Hide
          Thomas Neidhart added a comment -

          Fixed in r1341171.

          Show
          Thomas Neidhart added a comment - Fixed in r1341171.

            People

            • Assignee:
              Unassigned
              Reporter:
              Yuji Uchiyama
            • Votes:
              1 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development