Description
We have many Jenkins instances blasting tests: some official, some Policeman, and I and others have or had our own. The email trail proves the power of the Jenkins cluster to find test fails.
However, I still have a very hard time with some basic questions:
What tests are flakey right now? Which test fails actually affect devs most? Did I break it? Was that test already flakey? Is that test still flakey? What are our worst tests right now? Is that test getting better or worse?
We really need a way to see exactly which tests are the problem, not because of OS or environmental issues, but because of more basic test quality issues: which tests are flakey, and how flakey are they at any point in time?
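As a rough illustration of the kind of report being asked for, here is a minimal sketch that ranks tests by failure rate over repeated runs. It assumes the run results are already available as (test name, passed) pairs; the function name `flakiness_report` and the test names in the usage lines are hypothetical and do not come from any existing tool.

```python
from collections import defaultdict


def flakiness_report(runs):
    """Rank tests by failure rate.

    `runs` is an iterable of (test_name, passed) pairs, one per execution
    (an assumed input shape, not the format of any real Jenkins export).
    Returns a list of (test_name, failures, total, fail_rate) tuples,
    worst test first; ties are broken by test name.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for name, passed in runs:
        totals[name] += 1
        if not passed:
            failures[name] += 1

    report = [
        (name, failures[name], totals[name], failures[name] / totals[name])
        for name in totals
    ]
    # Highest failure rate first, then alphabetical for stable output.
    report.sort(key=lambda row: (-row[3], row[0]))
    return report


if __name__ == "__main__":
    # Hypothetical results: each test was run twice.
    sample_runs = [
        ("TestA", True), ("TestA", False),
        ("TestB", True), ("TestB", True),
        ("TestC", False), ("TestC", False),
    ]
    for name, failed, total, rate in flakiness_report(sample_runs):
        print(f"{name}: {failed}/{total} failed ({rate:.0%})")
```

Running the same calculation on results from different dates would give a per-test trend, which is one way to answer whether a test is getting better or worse.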
Reports:
https://drive.google.com/drive/folders/0ByYyjsrbz7-qa2dOaU1UZDdRVzg?usp=sharing
01/24/2017 - https://docs.google.com/spreadsheets/d/1JySta2j2s7A_p16wA1UO-l6c4GsUHBIb4FONS2EzW9k/edit?usp=sharing
02/01/2017 - https://docs.google.com/spreadsheets/d/1FndoyHmihaOVL2o_Zns5alpNdAJlNsEwQVoJ4XDWj3c/edit?usp=sharing
02/08/2017 - https://docs.google.com/spreadsheets/d/1N6RxH4Edd7ldRIaVfin0si-uSLGyowQi8-7mcux27S0/edit?usp=sharing
02/14/2017 - https://docs.google.com/spreadsheets/d/1eZ9_ds_0XyqsKKp8xkmESrcMZRP85jTxSKkNwgtcUn0/edit?usp=sharing
02/17/2017 - https://docs.google.com/spreadsheets/d/1LEPvXbsoHtKfIcZCJZ3_P6OHp7S5g2HP2OJgU6B2sAg/edit?usp=sharing
Attachments
Issue Links
- incorporates
  - SOLR-10079 TestInPlaceUpdates(Distrib|Standalone) failures (Resolved)
  - SOLR-10063 CoreContainer shutdown has race condition that can cause a hang on shutdown. (Closed)
  - SOLR-10101 TestLazyCores hangs. (Closed)
  - SOLR-10069 The Nightly test CdcrReplicationDistributedZkTest appears to be too fragile. (Open)
  - SOLR-10071 The Nightly test LeaderInitiatedRecoveryOnShardRestartTest appears to be too fragile. (Open)
  - SOLR-10139 ShardSplitTest needs to be hardened. (Open)
  - SOLR-10161 HdfsChaosMonkeySafeLeaderTest needs to be hardened. (Open)
  - SOLR-10162 HttpPartitionTest needs to be hardened. (Open)
  - SOLR-10165 MissingSegmentRecoveryTest needs to be hardened. (Open)
  - SOLR-10195 Harden AbstractSolrMorphlineZkTestBase based tests. (Open)
  - SOLR-10253 Make tests that are as expensive as our expensive @Nightlys @Nightly themselves. (Open)
  - SOLR-10125 CollectionsAPIDistributedZkTest is too fragile. (Reopened)
  - SOLR-10070 The test HdfsDirectoryFactoryTest appears to be unreliable. (Resolved)
  - SOLR-10107 CdcrReplicationDistributedZkTest fails far too often and is an extremely expensive test, even when compared to other nightlies. (Resolved)
  - SOLR-10119 TestReplicationHandler has always been too good at finding too many annoying bugs. (Resolved)
  - SOLR-10126 PeerSyncReplicationTest is a flakey test. (Resolved)
  - SOLR-10191 HdfsChaosMonkeyNothingIsSafeTest needs to be hardened. (Resolved)
  - SOLR-10066 The Nightly test HdfsWriteToMultipleCollectionsTest appears to be too fragile. (Closed)
  - SOLR-10064 The Nightly test HdfsCollectionsAPIDistributedZkTest appears to be too fragile. (Closed)
  - SOLR-10065 The Nightly test ConcurrentDeleteAndCreateCollectionTest appears to be too fragile. (Closed)
  - SOLR-10067 The Nightly test HdfsBasicDistributedZkTest appears to be too fragile. (Closed)
  - SOLR-10068 The Nightly test SharedFSAutoReplicaFailoverTest appears to be too fragile. (Closed)
  - SOLR-10072 The test TestSelectiveWeightCreation appears to be unreliable. (Closed)
  - SOLR-10098 HdfsThreadLeakTest and HdfsRecoverLeaseTest can leak threads (Closed)
  - SOLR-10109 SoftAutoCommitTest is too fragile. (Closed)
  - SOLR-10127 OverseerRolesTest needs to be hardened. (Closed)
  - SOLR-10160 HdfsTlogReplayBufferedWhileIndexingTest needs to be hardened. (Closed)
  - SOLR-10164 DistributedVersionInfoTest needs to be hardened. (Closed)
  - SOLR-10166 TestLBHttpSolrClient needs to be hardened. (Closed)
- is depended upon by
  - SOLR-12052 Schedule a BeastIt unit test report to be published every week. (Open)
- is duplicated by
  - SOLR-7354 Solr Jenkins test fails are out of control. (Closed)
- is related to
  - SOLR-8742 HdfsDirectoryTest fails reliably after changes in LUCENE-6932 (Closed)
  - SOLR-9903 Stop interrupting the update executor on shutdown, it can cause graceful shutdowns to put replicas into Leader Initiated Recovery among other undesirable things. (Closed)
  - SOLR-10169 PeerSync will hit an NPE on no response errors when looking for fingerprint. (Closed)
  - SOLR-10077 TestManagedFeatureStore extends LuceneTestCase, but has no tests and just hosts a static method. (Resolved)
  - SOLR-10074 TestConfig appears to be incompatible with custom ant test location properties that should be supported. (Closed)
  - SOLR-10075 Move assumes in TestNonWritablePersistFile to BeforeClass method. (Closed)
  - SOLR-10193 Improve MiniSolrCloudCluster#shutdown. (Closed)
- relates to
  - SOLR-9401 TestPKIAuthenticationPlugin NPE (Closed)
  - SOLR-10196 ElectionContext#runLeaderProcess can hit NPE on core close. (Closed)
  - SOLR-10248 Merge SolrTestCaseJ4's SolrIndexSearcher tracking into the ObjectReleaseTracker. (Open)
  - SOLR-10053 TestSolrCloudWithDelegationTokens failures (Resolved)
  - SOLR-10136 TestReqParamsAPI regularly fails on Policeman Jenkins (Resolved)
  - SOLR-10175 replace @Ignore for TestAnalyticsQParserPlugin (Resolved)
  - SOLR-10024 If you use ExternalPaths#determineSourceHome with a custom tests.workDir, tests may not find the path. (Closed)
  - SOLR-10073 TestCoreDiscovery appears to be incompatible with custom ant test location properties that should be supported. (Closed)
  - SOLR-10111 MBeansHandlerTest.testDiff regularly fails (Closed)
  - SOLR-10142 replace @Ignore/@AwaitsFix for TestClassNameShortening (Closed)
  - SOLR-10176 add @Ignore to forbiddenApis/solr.txt (Open)
I'll create sub-issues off this JIRA to address the worst tests I find.