Uploaded image for project: 'Solr'
  1. Solr
  2. SOLR-12037

Reduce noise from flakey tests

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 7.2
    • None
    • Tests
    • None

    Description

      Recreating SOLR-12016. Please do NOT delete this without discussion. NOTE: Uwe's build system modifications originally on 12016 have been incorporated into SOLR-12028.

      Current situation concerns:
      > There is so much noise from flakey tests (particularly Solr tests) that they are difficult to use.
      > The number of tests that regularly fail is increasing
      > Failures are being ignored
      > The number of failing tests makes releasing more difficult.
      > The number of failing tests make it harder to determine whether recent changes actually caused problems. Running the tests again until they succeed is used commonly at present, which is not robust.
      > e-mail notifications of failing tests are largely being ignored.
      Propsal:
      > Mark all currently "flakey" tests as BadApple or AwaitsFix
      > Run Jenkins jobs with BadApple (and/or AwaitsFix) enabled and disabled. Frequency TBD, depends partly on whether we can label emails from these runs for easy filtering of the two flavors.
      >> Label these runs with something suitable in the subject line (wish list)
      > Weekly reports on the tests labeled BadApple or AwaitsFix
      >> Perhaps this could be incorporated in the reports linked below (wish list)
      > Committers should enable BadApple (or AwaitsFix) regularly as a sanity check. Leave these as defaults.
      > We start getting much more aggressive about not allowing new flakey tests.
      NOTE: It's perfectly acceptable to have failing flakey tests as long as someone is activey working on fixing them.
      Concerns with solution
      > Decreases test coverage
      > Decreases visibility of flakey tests, making fixing them less likely.
      > Some tools (see below) that report on bad tests will not see tests that are annotated with BadApple or AwaitsFix.
      > Running unit tests and reporting errors are being conflated
      To be decided:
      > Can we label e-mails with failing tests with something in the subject line identifying whether they were run with BadApple/Awaits fix enabled or disabled? Can someone volunteer?
      > Is there any difference between BadApple and AwaitsFix? If not should we deprecate one? I propose we just use AwaitsFix and deprecate BadApple.
      > Can the automated reports (see below) be enhanced to also report tests labeled BadApple or AwaitsFix?
      Useful tools:
      > Steve Rowe's work on a Jenkins job to reproduce test failures (LUCENE-8106)
      > Hoss has worked on aggregating all test failures from the 3 Jenkins systems (ASF, Policeman, and Steve's), downloading the test results & logs, and running some reports/stats on failures.
      >> http://fucit.org/solr-jenkins-reports/
      >> https://github.com/hossman/jenkins-reports/
      >> http://fucit.org/solr-jenkins-reports/failure-report.html
      I've assigned this JIRA to myslef, but all volunteers welcome, especially anything that changes the build system.....
      I've decided to make this a SOLR jira on the theory that most of the offending tests are in the Solr hive, any sub-tasks for touching the build system can go under LUCENE if wanted.
      Also, I expect to add the annotation to some more tests for a few days as infrequent failures occur. Once we have stability (defined by there being little noise) that'll stop.
      3 BadApple 23 AwaitsFix annotations are currently in the code, linked to these issues:

      HADOOP-9893
      LUCENE-3869
      LUCENE-5575")
      LUCENE-5595
      LUCENE-5737
      LUCENE-6709
      LUCENE-7161
      SOLR-2715
      SOLR-6213
      SOLR-6443
      SOLR-6944
      SOLR-10071
      SOLR-10136
      SOLR-10734
      SOLR-11974
      Solr JIRAS about bad tests
      SOLR-2175
      SOLR-4147
      SOLR-5880
      SOLR-6423
      SOLR-6944
      SOLR-6961
      SOLR-6974
      SOLR-8122
      SOLR-8182
      SOLR-9869
      SOLR-10053
      SOLR-10070
      SOLR-10071
      SOLR-10139
      SOLR-10287

      SOLR-11911

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              erickerickson Erick Erickson
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

                Created:
                Updated: