Details
- Bug
- Status: Closed
- Major
- Resolution: Fixed
- None
Description
The test testSuspendLockingBlocksUntilNoLocks from class DistributedLockServiceDUnitTest failed twice in CI runs 967 and 969.
Results for the first failure are available here and for the second one here.
Archived artifacts for the first failure are available here and for the second one here.
The issue appears to be a race condition when launching an asynchronous thread on a remote VM through the following code:
    VM vm1 = getVM(1);
    vm1.invokeAsync(new SerializableRunnable("Lock & unlock in vm1") {
      @Override
      public void run() {
        DistributedLockService service2 = getServiceNamed(name);
        assertThat(service2.lock("lock", -1, -1)).isTrue();
        synchronized (monitor) {
          try {
            monitor.wait();
          } catch (InterruptedException ex) {
            out.println("Unexpected InterruptedException");
            fail("interrupted");
          }
        }
        service2.unlock("lock");
      }
    });
    // Let vm1's thread get the lock and go into wait()
    sleep(100);
If the thread on the remote VM has not started and acquired the lock by the time the 100 millisecond sleep ends, the test fails, because the thread on the local VM can then complete suspendLocking right away instead of blocking:
    Thread thread = new Thread(new Runnable() {
      @Override
      public void run() {
        setGot(service.suspendLocking(-1));
        setDone(true);
        service.resumeLocking();
      }
    });
    setGot(false);
    setDone(false);
    thread.start();
    // Let thread start, make sure it's blocked in suspendLocking
    sleep(100);
    assertThat(getGot() || getDone())
        .withFailMessage("Before release, got: " + getGot() + ", done: " + getDone()).isFalse();
Increasing the sleep time might reduce recurrences of the issue. Another option would be to investigate how to make the test wait until the asynchronous invocation has actually started on the remote VM, instead of arbitrarily sleeping for 100 milliseconds.
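The second option can be illustrated with a minimal, single-JVM sketch that polls a condition up to a deadline instead of sleeping for a fixed interval. It is not the fix applied to the test: the class WaitForLockHolder and the helper awaitCondition are hypothetical names, and a plain ReentrantLock stands in for the DistributedLockService held by vm1. In the real DUnit test the condition would have to be evaluated on the remote VM, which this sketch does not attempt.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.BooleanSupplier;

public class WaitForLockHolder {

  // Hypothetical helper: polls the condition every 10 ms until it holds
  // or the timeout expires. Returns true only if the condition was observed.
  static boolean awaitCondition(BooleanSupplier condition, long timeoutMillis)
      throws InterruptedException {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (!condition.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        return false;
      }
      Thread.sleep(10);
    }
    return true;
  }

  // Stand-in for the test flow: a holder thread takes the lock and waits on
  // a monitor, while the caller waits until the lock is really held before
  // going on (rather than assuming 100 ms is always enough).
  static boolean demo() throws InterruptedException {
    ReentrantLock lock = new ReentrantLock(); // assumption: replaces the distributed lock
    Object monitor = new Object();
    boolean[] release = {false}; // guarded by monitor

    Thread holder = new Thread(() -> {
      lock.lock();
      try {
        synchronized (monitor) {
          while (!release[0]) {
            monitor.wait();
          }
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      } finally {
        lock.unlock();
      }
    });
    holder.start();

    // Replaces sleep(100): proceed only once the holder owns the lock.
    boolean held = awaitCondition(lock::isLocked, 5_000);

    // Release the holder, as the original test does through the monitor.
    synchronized (monitor) {
      release[0] = true;
      monitor.notifyAll();
    }
    holder.join();
    return held && !lock.isLocked();
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("lock observed as held before proceeding: " + demo());
  }
}
```

With a bounded poll, a slow thread launch only delays the test rather than failing it, and the timeout still catches the case where the lock is never acquired.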