Details
-
Bug
-
Status: Resolved
-
Low
-
Resolution: Fixed
-
None
-
Correctness - Test Failure
-
Low
-
Low Hanging Fruit
-
Fuzz Test
-
All
-
None
-
Description
The following thread dump was seen on the accord branch, but the problem exists in trunk as well.
node1_isolatedExecutor:8:
    java.base@11.0.15/jdk.internal.misc.Unsafe.park(Native Method)
    java.base@11.0.15/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:234)
    org.apache.cassandra.simulator.systems.InterceptorOfSystemMethods$None.parkNanos(InterceptorOfSystemMethods.java:373)
    org.apache.cassandra.simulator.systems.InterceptorOfSystemMethods$Global.parkNanos(InterceptorOfSystemMethods.java:166)
    java.base@11.0.15/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2123)
    java.base@11.0.15/java.util.concurrent.ThreadPoolExecutor.awaitTermination(ThreadPoolExecutor.java:1454)
    org.apache.cassandra.utils.ExecutorUtils.awaitTerminationUntil(ExecutorUtils.java:110)
    org.apache.cassandra.utils.ExecutorUtils.awaitTermination(ExecutorUtils.java:100)
    org.apache.cassandra.concurrent.Stage.shutdownAndWait(Stage.java:195)
    org.apache.cassandra.distributed.impl.Instance.lambda$shutdown$44(Instance.java:975)
node1_MiscStage:1:
    java.base@11.0.15/jdk.internal.misc.Unsafe.park(Native Method)
    java.base@11.0.15/java.util.concurrent.locks.LockSupport.park(LockSupport.java:323)
    org.apache.cassandra.utils.concurrent.WaitQueue$Standard$AbstractSignal.await(WaitQueue.java:290)
    org.apache.cassandra.utils.concurrent.WaitQueue$Standard$AbstractSignal.await(WaitQueue.java:283)
    org.apache.cassandra.utils.concurrent.Awaitable$AsyncAwaitable.await(Awaitable.java:306)
    org.apache.cassandra.utils.concurrent.Awaitable$AsyncAwaitable.await(Awaitable.java:338)
    org.apache.cassandra.utils.concurrent.Awaitable$Defaults.awaitUninterruptibly(Awaitable.java:186)
    org.apache.cassandra.utils.concurrent.Awaitable$AbstractAwaitable.awaitUninterruptibly(Awaitable.java:259)
    org.apache.cassandra.tcm.log.LocalLog$Async.runOnce(LocalLog.java:710)
    org.apache.cassandra.tcm.log.LocalLog.runOnce(LocalLog.java:404)
    org.apache.cassandra.tcm.log.LocalLog.waitForHighestConsecutive(LocalLog.java:346)
    org.apache.cassandra.tcm.PaxosBackedProcessor.fetchLogAndWait(PaxosBackedProcessor.java:163)
    org.apache.cassandra.tcm.AbstractLocalProcessor.commit(AbstractLocalProcessor.java:109)
    org.apache.cassandra.distributed.test.log.TestProcessor.commit(TestProcessor.java:61)
    org.apache.cassandra.tcm.ClusterMetadataService$SwitchableProcessor.commit(ClusterMetadataService.java:841)
    org.apache.cassandra.tcm.Processor.commit(Processor.java:45)
    org.apache.cassandra.tcm.ClusterMetadataService.commit(ClusterMetadataService.java:516)
    org.apache.cassandra.service.accord.AccordFastPathCoordinator$Impl.lambda$updateFastPath$2(AccordFastPathCoordinator.java:208)
    org.apache.cassandra.service.accord.AccordFastPathCoordinator$Impl$$Lambda$11211/0x0000000802441840.run(Unknown Source)
Accord is trying to commit a new epoch, but TCM waits using “awaitUninterruptibly”, which ignores the thread interrupt issued while the cluster is shutting down. While this is happening the instance is unable to make progress, so it loops endlessly and the test fails to close.
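The effect can be reproduced outside Cassandra with plain java.util.concurrent. The sketch below is not the TCM or Accord code: InterruptDemo, awaitUninterruptibly and terminatesAfterShutdownNow are hypothetical names, and a CountDownLatch that is never counted down stands in for the log entry that never arrives. It only illustrates that a wait which swallows interrupts keeps an executor from terminating after shutdownNow(), whereas an interruptible wait lets the worker exit.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class InterruptDemo {
    // Hypothetical stand-in for an uninterruptible wait: an interrupt is
    // remembered but the wait is retried until the latch is released.
    static void awaitUninterruptibly(CountDownLatch latch) {
        boolean interrupted = false;
        while (true) {
            try {
                latch.await();
                break;
            } catch (InterruptedException e) {
                interrupted = true; // swallow and keep waiting
            }
        }
        if (interrupted) Thread.currentThread().interrupt();
    }

    // Returns true if the executor terminated shortly after shutdownNow().
    static boolean terminatesAfterShutdownNow(boolean uninterruptible) {
        CountDownLatch neverSignalled = new CountDownLatch(1); // assumed: the awaited event never happens
        CountDownLatch started = new CountDownLatch(1);
        ExecutorService executor = Executors.newSingleThreadExecutor();
        executor.execute(() -> {
            started.countDown();
            if (uninterruptible) {
                awaitUninterruptibly(neverSignalled);
            } else {
                try {
                    neverSignalled.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // give up on interrupt
                }
            }
        });
        try {
            started.await();
            executor.shutdownNow(); // interrupts the worker thread
            boolean terminated = executor.awaitTermination(500, TimeUnit.MILLISECONDS);
            neverSignalled.countDown(); // release the worker so the demo itself can exit
            executor.awaitTermination(5, TimeUnit.SECONDS);
            return terminated;
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("uninterruptible wait terminated: " + terminatesAfterShutdownNow(true));
        System.out.println("interruptible wait terminated: " + terminatesAfterShutdownNow(false));
    }
}
```

In this sketch the uninterruptible variant is only unblocked because the demo releases the latch itself after the timeout. In the traces above nothing plays that role during shutdown, which matches node1_isolatedExecutor staying parked in awaitTermination.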
Attachments
Issue Links
- is related to CASSANDRA-19847 Create a fuzz test that randomizes topology changes, cluster actions, and CQL operations (Resolved)
- links to