We have many Jenkins instances blasting tests: some official, some policeman, and I and others have or have had our own. The email trail proves the power of the Jenkins cluster to find test failures.
However, I still have a very hard time with some basic questions:
Which tests are flakey right now? Which test failures actually affect devs most? Did I break it? Was that test already flakey? Is that test still flakey? What are our worst tests right now? Is that test getting better or worse?
We really need a way to see exactly which tests are the problem, not because of OS or environmental issues, but because of more basic test quality issues: which tests are flakey, and how flakey are they at any point in time?
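As a rough illustration of what "how flakey" could mean, here is a minimal sketch that ranks tests by the percentage of their runs that failed. It is not how the reports below were produced. The function name `failure_rates` and the `(test_name, passed)` input shape are assumptions made for this example, and collecting those results from Jenkins is left out.

```python
from collections import defaultdict


def failure_rates(runs):
    """Rank tests by failure percentage, worst first.

    `runs` is a hypothetical input: an iterable of (test_name, passed)
    tuples, one tuple per execution of a test.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for test_name, passed in runs:
        totals[test_name] += 1
        if not passed:
            failures[test_name] += 1
    # Percentage of executions that failed, per test.
    rates = {name: 100.0 * failures[name] / totals[name] for name in totals}
    # Worst tests first; ties are broken by name so the order is stable.
    return sorted(rates.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    sample = [
        ("TestA", True), ("TestA", False),
        ("TestB", True), ("TestB", True),
        ("TestC", False),
    ]
    for name, rate in failure_rates(sample):
        print(f"{name}: {rate:.1f}% failed")
```

Running the same calculation over results from different time windows would give a way to tell whether a given test is getting better or worse.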
01/24/2017 - https://docs.google.com/spreadsheets/d/1JySta2j2s7A_p16wA1UO-l6c4GsUHBIb4FONS2EzW9k/edit?usp=sharing
02/01/2017 - https://docs.google.com/spreadsheets/d/1FndoyHmihaOVL2o_Zns5alpNdAJlNsEwQVoJ4XDWj3c/edit?usp=sharing
02/08/2017 - https://docs.google.com/spreadsheets/d/1N6RxH4Edd7ldRIaVfin0si-uSLGyowQi8-7mcux27S0/edit?usp=sharing
02/14/2017 - https://docs.google.com/spreadsheets/d/1eZ9_ds_0XyqsKKp8xkmESrcMZRP85jTxSKkNwgtcUn0/edit?usp=sharing
02/17/2017 - https://docs.google.com/spreadsheets/d/1LEPvXbsoHtKfIcZCJZ3_P6OHp7S5g2HP2OJgU6B2sAg/edit?usp=sharing