Details
-
Bug
-
Status: Resolved
-
Normal
-
Resolution: Fixed
-
None
-
Correctness - Test Failure
-
Normal
-
Normal
-
User Report
-
All
-
None
-
Description
More precisely, when the MessagingService version to VERSION_50, the test fails with an exception that appears to be a genuine problem:
junit.framework.AssertionFailedError: Exception found expected null, but was:<java.lang.RuntimeException: Did not get replies from all endpoints. at org.apache.cassandra.service.ActiveRepairService.lambda$prepareForRepair$2(ActiveRepairService.java:678) at org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) at java.base/java.lang.Thread.run(Thread.java:833) > at org.apache.cassandra.distributed.test.DistributedRepairUtils.lambda$assertParentRepairSuccess$4(DistributedRepairUtils.java:129) at org.apache.cassandra.distributed.test.DistributedRepairUtils.validateExistingParentRepair(DistributedRepairUtils.java:164) at org.apache.cassandra.distributed.test.DistributedRepairUtils.assertParentRepairSuccess(DistributedRepairUtils.java:124) at org.apache.cassandra.distributed.test.RepairTest.testForcedNormalRepairWithOneNodeDown(RepairTest.java:211) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) org.apache.cassandra.distributed.shared.ShutdownException: Uncaught exceptions were thrown during test at org.apache.cassandra.distributed.impl.AbstractCluster.checkAndResetUncaughtExceptions(AbstractCluster.java:1117) at org.apache.cassandra.distributed.impl.AbstractCluster.close(AbstractCluster.java:1103) at org.apache.cassandra.distributed.test.RepairTest.closeCluster(RepairTest.java:160) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) Suppressed: java.lang.IllegalStateException: complete already: (failure: java.lang.RuntimeException: Did not get replies from all endpoints.) at org.apache.cassandra.utils.concurrent.AsyncPromise.setSuccess(AsyncPromise.java:106) at org.apache.cassandra.service.ActiveRepairService$2.ack(ActiveRepairService.java:721) at org.apache.cassandra.service.ActiveRepairService$2.onResponse(ActiveRepairService.java:697) at org.apache.cassandra.repair.messages.RepairMessage$2.onResponse(RepairMessage.java:187) at org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:58) at org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:78) at org.apache.cassandra.net.InboundSink$Filtered.accept(InboundSink.java:64) at org.apache.cassandra.net.InboundSink$Filtered.accept(InboundSink.java:50) at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:97) at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:45) at org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:430) at org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133) at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:143) at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) at java.base/java.lang.Thread.run(Thread.java:833)
The updates to pending in ActiveRepairService are not concurrency-safe, but fixing them by doing e.g.
Index: src/java/org/apache/cassandra/service/ActiveRepairService.java IDEA additional info: Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP <+>UTF-8 =================================================================== diff --git a/src/java/org/apache/cassandra/service/ActiveRepairService.java b/src/java/org/apache/cassandra/service/ActiveRepairService.java --- a/src/java/org/apache/cassandra/service/ActiveRepairService.java (revision 04552046f74f596e69e2d98c3f3e522fb5888c99) +++ b/src/java/org/apache/cassandra/service/ActiveRepairService.java (date 1700839874092) @@ -675,7 +675,7 @@ if (promise.isDone()) return; String errorMsg = "Did not get replies from all endpoints."; - if (promise.tryFailure(new RuntimeException(errorMsg))) + if (pending.getAndSet(-1) > 0 && promise.tryFailure(new RuntimeException(errorMsg))) participateFailed(parentRepairSession, errorMsg); }, timeoutMillis, MILLISECONDS); @@ -703,8 +703,8 @@ failedNodes.add(from.toString()); if (failureReason == RequestFailureReason.TIMEOUT) { - pending.set(-1); - promise.setFailure(failRepairException(parentRepairSession, "Did not get replies from all endpoints.")); + if (pending.getAndSet(-1) > 0) + promise.setFailure(failRepairException(parentRepairSession, "Did not get replies from all endpoints.")); } else {
still results in a test failure:
java.lang.AssertionError: Exception found expected null, but was:<java.lang.RuntimeException: Did not get replies from all endpoints. at org.apache.cassandra.service.ActiveRepairService.lambda$prepareForRepair$2(ActiveRepairService.java:678) at org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) at java.base/java.lang.Thread.run(Thread.java:829)> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotNull(Assert.java:755) at org.junit.Assert.assertNull(Assert.java:737) at org.apache.cassandra.distributed.test.DistributedRepairUtils.lambda$assertParentRepairSuccess$4(DistributedRepairUtils.java:129) at org.apache.cassandra.distributed.test.DistributedRepairUtils.validateExistingParentRepair(DistributedRepairUtils.java:164) at org.apache.cassandra.distributed.test.DistributedRepairUtils.assertParentRepairSuccess(DistributedRepairUtils.java:124) at org.apache.cassandra.distributed.test.RepairTest.testForcedNormalRepairWithOneNodeDown(RepairTest.java:211) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.runners.ParentRunner.run(ParentRunner.java:363) at org.junit.runner.JUnitCore.run(JUnitCore.java:137) at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69) at com.intellij.rt.junit.IdeaTestRunner$Repeater$1.execute(IdeaTestRunner.java:38) at com.intellij.rt.execution.junit.TestsRepeater.repeat(TestsRepeater.java:11) at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:35) at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:232) at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:55)
Attachments
Issue Links
- Discovered while testing
-
CASSANDRA-18753 Add an optimized default configuration to tests and make it available for new users
- Resolved
- links to