Since the change to async logger, this test fails 100% of the time for me locally. steve_rowe confirmed that it fails for him too so it's not something weird in my environment.
Why there are no reports from Jenkins of this failure is a mystery.
The test methods themselves pass, but the suite then fails on a thread leak, and the stack trace points to "lmax.disruptor", which is certainly part of the async logging.
Another mystery is why, if this is something generic, it doesn't fail many other tests. I suspect it's something specific to this test, perhaps having to do with mocking the metric reporter, but that's a wild shot in the dark.
My working hypothesis is that something isn't being closed or shut down correctly, but that's a little vague.
Sep 04, 2018 9:38:00 AM com.carrotsearch.randomizedtesting.ThreadLeakControl checkThreadLeaks
WARNING: Will linger awaiting termination of 1 leaked thread(s).
Sep 04, 2018 9:38:20 AM com.carrotsearch.randomizedtesting.ThreadLeakControl checkThreadLeaks
SEVERE: 1 thread leaked from SUITE scope at org.apache.solr.metrics.SolrMetricReporterTest:
   1) Thread[id=14, name=Log4j2-TF-1-AsyncLoggerConfig--1, state=TIMED_WAITING, group=TGRP-SolrMetricReporterTest]
        at sun.misc.Unsafe.park(Native Method)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
        at com.lmax.disruptor.TimeoutBlockingWaitStrategy.waitFor(TimeoutBlockingWaitStrategy.java:38)
        at com.lmax.disruptor.ProcessingSequenceBarrier.waitFor(ProcessingSequenceBarrier.java:56)
        at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
        at java.lang.Thread.run(Thread.java:748)
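
One way to narrow this down might be to check, from inside the test's teardown, whether any thread with the "Log4j2-" name prefix seen in the trace is still alive. The following is only a rough diagnostic sketch using the JDK alone; the class and method names (LeakedThreadFinder, liveThreadsWithPrefix) are made up for illustration and are not part of Solr, Log4j2, or the randomizedtesting framework.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Solr or Log4j2: lists live threads by name prefix.
public class LeakedThreadFinder {

    /** Returns the names of all live threads whose name starts with the given prefix. */
    public static List<String> liveThreadsWithPrefix(String prefix) {
        List<String> names = new ArrayList<>();
        // Thread.getAllStackTraces() returns a map keyed by every live thread in the JVM.
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && t.getName().startsWith(prefix)) {
                names.add(t.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // "Log4j2-" matches the leaked thread name in the report above
        // (Log4j2-TF-1-AsyncLoggerConfig--1).
        for (String name : liveThreadsWithPrefix("Log4j2-")) {
            System.out.println("still alive: " + name);
        }
    }
}
```

If a call like liveThreadsWithPrefix("Log4j2-") still returns the AsyncLoggerConfig thread after the suite's cleanup has run, that would be consistent with the hypothesis that the async logger is never stopped for this test, though it would not by itself explain why other tests are unaffected.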