When I measure the times in TestReplication.queueFailover, the test takes about 15s on my (reasonably fast) laptop.
The timeout in queueFailover currently is 1500*2*15 = 45000ms.
For setup before each test (which truncates the table and waits for the changes to replicate) it is 1500*15 = 22500ms.
Interestingly I see queueFailover failures where the wait time is measured as 64260ms and some at 72316ms.
Since these numbers are well above 45000ms, the machine or JVM must have been stuck for roughly 19s and 27s respectively (otherwise we would hit the timeout and the total time spent would be close to the timeout).
So I would suggest that we increase the timeouts further.
We could set SLEEP_TIME to 2000 and retries to 20. That would give 2000*2*20 = 80000ms.
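
To make the arithmetic easy to re-check, here is a small standalone sketch. It is not code from TestReplication: the class name TimeoutBudget and both helper methods are made up for illustration, and the formula (sleep time * retries * multiplier) is simply the one used in the numbers above.

```java
// Hypothetical helper, not part of the HBase test code.
public class TimeoutBudget {

    // Total wait budget in ms: sleep per retry * number of retries * multiplier.
    static long budgetMs(long sleepTimeMs, int retries, int multiplier) {
        return sleepTimeMs * retries * multiplier;
    }

    // How far a measured wait exceeds a budget (negative if it stayed under).
    static long overshootMs(long measuredMs, long budgetMs) {
        return measuredMs - budgetMs;
    }

    public static void main(String[] args) {
        long current = budgetMs(1500, 15, 2);   // queueFailover today
        long setup = budgetMs(1500, 15, 1);     // per-test setup
        long proposed = budgetMs(2000, 20, 2);  // suggested values

        System.out.println("current=" + current + "ms setup=" + setup
                + "ms proposed=" + proposed + "ms");

        // The two failing wait times reported above.
        long[] measured = {64260L, 72316L};
        for (long m : measured) {
            System.out.println(m + "ms: over current by " + overshootMs(m, current)
                    + "ms, over proposed by " + overshootMs(m, proposed) + "ms");
        }
    }
}
```

Both observed wait times (64260ms and 72316ms) would stay under the proposed 80000ms, although a longer stall could of course still exceed it.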