While working on LOG4J2-2653, I've come across some tests that rely on the use of Thread.sleep to test effects of some (typically concurrent) code. For example, many of the rolling file appender tests rely on this for checking the results of log file rollover/compression/deletes/etc. These tests tend to be the worst offenders in execution time by about 1-3 orders of magnitude compared to typical tests (e.g., most tests execute in a fraction of a second while these sleep-inducing tests can take several seconds each).
Interestingly enough, these tests are also typically the same ones that sporadically fail in CI due to timing issues. The systems under test are also typically some of the less well specified (from a formal standpoint) that receive a decent amount of bug reports and bug fixes. The difficulty of testing concurrent or parallel code has exacerbated the issue.
Tests relying on Thread.sleep should be refactored where possible to use more appropriate concurrency APIs. For example, instead of busy-waiting in a loop waiting for a condition to appear, this can use concurrency primitives like CountDownLatch or CyclicBarrier. This may require the introduction of some package-private test hooks in various plugins to coordinate them in a test, or it may involve the introduction of natural APIs in Log4j itself (like ReliabilityStrategy).
For tests that rely on the current time and such, these should be updated to use the configured Log4j Clock which can use a mock version in tests rather than testing the real clock (which just wastes time). There's very little reason for unit tests to be testing the passage of real time.