Details
-
Improvement
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
None
Description
On my laptop, TestSCMSafeModeManager.testSafeModePipelineExitRule() always takes just over 200 seconds. Checking a few PR runs, it does not seem to be slow on PRs, just locally.
Debugging the code, it seems to hang in the BackgroundPipelineCreator.stop() method, where it is waiting for the thread to join:
    public void stop() {
      if (!running.compareAndSet(true, false)) {
        LOG.warn("{} is not running, just ignore.", THREAD_NAME);
        return;
      }
      LOG.info("Stopping {}.", THREAD_NAME);
      // in case RatisPipelineUtilsThread is sleeping
      synchronized (monitor) {
        monitor.notifyAll();
      }
      try {
        thread.join(); // ----> Hangs here
      } catch (InterruptedException e) {
        LOG.warn("Interrupted during join {}.", THREAD_NAME);
        Thread.currentThread().interrupt();
      }
    }
The join is clearly hanging because the background thread has not exited. I believe the cause is that `notifyAll()` is being used to wake the thread so that it can exit, when the thread should really be interrupted. A notification is not remembered: if `notifyAll()` is called while the thread is not waiting on the monitor, it has no effect, so the thread simply enters its wait afterwards and does not exit until it next wakes up.
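To illustrate the suggested direction, here is a minimal sketch of a background thread that is stopped with `Thread.interrupt()` rather than a notification. It is not the actual BackgroundPipelineCreator code: the class `InterruptibleWorker`, its fields, and the interval value are hypothetical names chosen for the example. The point it relies on is that the interrupt status is sticky, so an interrupt delivered while the thread is busy still causes the next `wait()` to throw `InterruptedException` immediately.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a periodic background task; not the Ozone class.
final class InterruptibleWorker {
  private final AtomicBoolean running = new AtomicBoolean(false);
  private final Object monitor = new Object();
  private final long intervalMs;
  private volatile Thread thread;

  InterruptibleWorker(long intervalMs) {
    this.intervalMs = intervalMs;
  }

  void start() {
    if (!running.compareAndSet(false, true)) {
      return; // already running
    }
    thread = new Thread(this::run, "InterruptibleWorker");
    thread.setDaemon(true);
    thread.start();
  }

  private void run() {
    while (running.get()) {
      // Periodic work would go here. A notifyAll() issued at this point
      // would be lost, but a pending interrupt is not.
      try {
        synchronized (monitor) {
          monitor.wait(intervalMs); // throws at once if already interrupted
        }
      } catch (InterruptedException e) {
        return; // stop() asked us to exit
      }
    }
  }

  void stop() {
    if (!running.compareAndSet(true, false)) {
      return; // not running, nothing to do
    }
    thread.interrupt(); // wakes a waiting thread, or marks a busy one
    try {
      thread.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  boolean isAlive() {
    Thread t = thread;
    return t != null && t.isAlive();
  }

  public static void main(String[] args) {
    InterruptibleWorker worker = new InterruptibleWorker(200_000);
    worker.start();
    long begin = System.nanoTime();
    worker.stop();
    long elapsedMs = (System.nanoTime() - begin) / 1_000_000;
    System.out.println("stop() returned after " + elapsedMs + " ms");
  }
}
```

With this shape, `stop()` should return promptly whether or not the worker has reached its `wait()` yet. An alternative that keeps `notifyAll()` would be to re-check `running` inside the `synchronized` block immediately before waiting, so the flag check and the wait cannot be separated by the notification.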