Details
Test
Status: Closed
Minor
Resolution: Fixed
Description
The addition of 40+ chi-square tests for RandomUtils has caused RandomUtilsTest to start failing frequently.
Phil Steitz investigated this and wrote the following (http://markmail.org/message/mo4qb3qh75nq2kwn) on the mailing list:
The random data tests are failing at a high enough frequency to be annoying / alarming to users. I investigated the high incidence of test failures and found nothing wrong with what the tests are doing and nothing to indicate systematic bias in the data being generated; but the addition of 40+ chi-square tests in the test methods added in r907159 makes the probability of failure in a given run > 1/25. This is why there is a high incidence of test failures. I verified that failures appear to be evenly distributed (too many, too few even/odd, too many, too few above/below range midpoints) and that the chi-square statistics are being computed correctly, with the right critical values applied.

If you do cut another RC, I would recommend one of the following:
1) Grab / copy and extend [math]'s RetryTestCase (will cut incidence of failure in half)
2) Disable the stochastic test cases for the release
3) Reduce sensitivity of the chi-square test (change to e.g., .0005 level of significance)
4) Reduce the number of tests

My recommendation is 2) - leave in the source but comment out. The tests are valuable as they would fail regularly and miserably if there were systematic bias (as there used to be on odd/even); but without reducing significantly the number of tests or the sensitivity (or limiting to a single "successful" PRNG sequence), there is no way to leave them all in without generating an annoyingly high rate of random failures.
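The "> 1/25" figure in the message can be roughly reproduced with a small calculation. The sketch below is only illustrative and rests on assumptions not stated in the report: that the chi-square tests are independent of each other, that there are exactly 40 of them (the report says "40+"), and that the current significance level is 0.001 (inferred from the suggestion to drop to .0005). The class and method names are hypothetical.

```java
public class FalseFailureRate {

    /**
     * Probability that at least one of {@code tests} independent tests
     * fails purely by chance, when each one rejects a correct generator
     * with probability {@code alpha}.
     */
    static double atLeastOneFailure(int tests, double alpha) {
        return 1.0 - Math.pow(1.0 - alpha, tests);
    }

    public static void main(String[] args) {
        int tests = 40;                      // assumed; the report only says "40+"
        double[] alphas = {0.001, 0.0005};   // 0.001 is an assumed current level

        for (double alpha : alphas) {
            System.out.printf("alpha=%.4f -> P(at least one false failure)=%.4f%n",
                    alpha, atLeastOneFailure(tests, alpha));
        }
    }
}
```

Under these assumptions the chance of at least one spurious failure per run is about 0.039 at the 0.001 level, which is close to 1 in 25, and about 0.020 at the .0005 level. This suggests why options 3) and 4) reduce the failure rate but do not eliminate it: any non-zero significance level multiplied across many tests still yields occasional random failures.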