  Commons Lang / LANG-592

RandomUtils tests are failing frequently



    • Type: Test
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.5
    • Component/s: lang.math.*
    • Labels: None


      The addition of 40+ chi-square tests for RandomUtils has caused RandomUtilsTest to start failing frequently.

      Phil Steitz investigated this and wrote the following (http://markmail.org/message/mo4qb3qh75nq2kwn) on the mailing list:

      The random data tests are failing at a high enough frequency to be
      annoying / alarming to users.
      I investigated the high incidence of test failures and found nothing
      wrong with what the tests are doing and nothing to indicate
      systematic bias in the data being generated; but the addition of 40+
      chi-square tests in the test methods added in r907159 makes the
      probability of failure in a given run > 1/25.  This is why there is
      a high incidence of test failures.
      I verified that failures appear to be evenly distributed (too many,
      too few even/odd, too many, too few above/below range midpoints) and
      that the chisquare statistics are being computed correctly, with the
      right critical values applied.
      If you do cut another RC, I would recommend one of the following:
      1) Grab / copy and extend [math]'s RetryTestCase (will cut incidence
      of failure in half)
      2) Disable the stochastic test cases for the release
      3) Reduce sensitivity of the chi-square test (change to e.g., .0005
      level of significance)
      4) Reduce the number of tests
      My recommendation is 2) - leave in the source but comment out.  The
      tests are valuable as they would fail regularly and miserably if
      there were systematic bias (as there used to be on odd/even); but
      without reducing significantly the number of tests or the
      sensitivity (or limiting to a single "successful" PRNG sequence),
      there is no way to leave them all in without generating an
      annoyingly high rate of random failures.
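
The "> 1/25" figure in the message can be reproduced with a short calculation. The sketch below assumes the chi-square tests are independent and that each one runs at a 0.001 significance level; neither the level nor the exact test count is stated in the message, so both are assumptions here (0.001 is chosen because it is twice the .0005 level proposed in option 3). The class and method names are hypothetical. Under those assumptions, the chance that at least one of n tests rejects is 1 - (1 - alpha)^n.

```java
public class FamilywiseFailureRate {

    /**
     * Probability that at least one of {@code tests} independent tests
     * fails by chance when each is run at significance level {@code alpha}.
     */
    public static double probabilityOfAnyFailure(double alpha, int tests) {
        return 1.0 - Math.pow(1.0 - alpha, tests);
    }

    public static void main(String[] args) {
        // "40+" tests in the message; 41 is an assumed concrete count.
        int tests = 41;
        // 0.001 is an ASSUMED current level; 0.0005 is the level from option 3.
        double[] alphas = {0.001, 0.0005};
        for (double alpha : alphas) {
            double p = probabilityOfAnyFailure(alpha, tests);
            System.out.printf("alpha=%.4f tests=%d P(any failure)=%.4f%n",
                    alpha, tests, p);
        }
    }
}
```

With these assumed inputs the per-run failure probability comes out at roughly 4% for the 0.001 level and roughly 2% for 0.0005, which is consistent with the reported "> 1/25" and shows why lowering the significance level or the number of tests reduces spurious failures.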




            Assignee: Niall Pemberton (niallp)
            Reporter: Niall Pemberton (niallp)