Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Description
We've seen this test fail in CI with the following stack trace:
ParallelWANPersistenceEnabledGatewaySenderDUnitTest > testpersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived FAILED
    java.lang.AssertionError: java.util.concurrent.TimeoutException: Timed out waiting 300000 milliseconds for AsyncInvocation to complete.
        at org.apache.geode.test.dunit.AsyncInvocation.await(AsyncInvocation.java:180)
        at org.apache.geode.internal.cache.wan.WANTestBase.startSenderwithCleanQueuesInVMsAsync(WANTestBase.java:1100)
        at org.apache.geode.internal.cache.wan.parallel.ParallelWANPersistenceEnabledGatewaySenderDUnitTest.testpersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived(ParallelWANPersistenceEnabledGatewaySenderDUnitTest.java:1780)
    Caused by: java.util.concurrent.TimeoutException: Timed out waiting 300000 milliseconds for AsyncInvocation to complete.
        at org.apache.geode.test.dunit.AsyncInvocation.timeoutIfAlive(AsyncInvocation.java:509)
        at org.apache.geode.test.dunit.AsyncInvocation.await(AsyncInvocation.java:447)
        at org.apache.geode.test.dunit.AsyncInvocation.await(AsyncInvocation.java:178)
        ... 2 more
    Caused by: org.apache.geode.test.dunit.internal.StackTrace: Stack trace for vm-4 thread-33
        at java.lang.Thread.sleep(Native Method)
        at org.apache.geode.internal.cache.PartitionedRegion.shadowPRWaitForBucketRecovery(PartitionedRegion.java:10057)
        at org.apache.geode.internal.cache.wan.parallel.ParallelGatewaySenderQueue.addShadowPartitionedRegionForUserPR(ParallelGatewaySenderQueue.java:570)
        at org.apache.geode.internal.cache.wan.parallel.ParallelGatewaySenderQueue.addShadowPartitionedRegionForUserPR(ParallelGatewaySenderQueue.java:447)
        at org.apache.geode.internal.cache.wan.parallel.ParallelGatewaySenderQueue.<init>(ParallelGatewaySenderQueue.java:281)
        at org.apache.geode.internal.cache.wan.parallel.ParallelGatewaySenderQueue.<init>(ParallelGatewaySenderQueue.java:250)
        at org.apache.geode.internal.cache.wan.parallel.ParallelGatewaySenderEventProcessor.initializeMessageQueue(ParallelGatewaySenderEventProcessor.java:80)
        at org.apache.geode.internal.cache.wan.parallel.ParallelGatewaySenderEventProcessor.<init>(ParallelGatewaySenderEventProcessor.java:65)
        at org.apache.geode.cache.wan.internal.parallel.RemoteParallelGatewaySenderEventProcessor.<init>(RemoteParallelGatewaySenderEventProcessor.java:39)
        at org.apache.geode.cache.wan.internal.parallel.RemoteConcurrentParallelGatewaySenderEventProcessor.createProcessors(RemoteConcurrentParallelGatewaySenderEventProcessor.java:50)
        at org.apache.geode.internal.cache.wan.parallel.ConcurrentParallelGatewaySenderEventProcessor.<init>(ConcurrentParallelGatewaySenderEventProcessor.java:102)
        at org.apache.geode.cache.wan.internal.parallel.RemoteConcurrentParallelGatewaySenderEventProcessor.<init>(RemoteConcurrentParallelGatewaySenderEventProcessor.java:38)
        at org.apache.geode.cache.wan.internal.parallel.ParallelGatewaySenderImpl.start(ParallelGatewaySenderImpl.java:86)
        at org.apache.geode.cache.wan.internal.parallel.ParallelGatewaySenderImpl.startWithCleanQueue(ParallelGatewaySenderImpl.java:61)
        at org.apache.geode.internal.cache.wan.WANTestBase.startSenderwithCleanQueues(WANTestBase.java:1134)
        at org.apache.geode.internal.cache.wan.WANTestBase.lambda$startSenderwithCleanQueuesInVMsAsync$1527b440$1(WANTestBase.java:1096)
        at org.apache.geode.internal.cache.wan.WANTestBase$$Lambda$456/976167913.run(Unknown Source)
        at org.apache.geode.test.dunit.internal.IdentifiableRunnable.run(IdentifiableRunnable.java:41)
        at sun.reflect.GeneratedMethodAccessor309.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.geode.test.dunit.internal.MethodInvoker.executeObject(MethodInvoker.java:123)
        at org.apache.geode.test.dunit.internal.RemoteDUnitVM.executeMethodOnObject(RemoteDUnitVM.java:78)
        at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:357)
        at sun.rmi.transport.Transport$1.run(Transport.java:200)
        at sun.rmi.transport.Transport$1.run(Transport.java:197)
        at java.security.AccessController.doPrivileged(Native Method)
        at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
        at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:573)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:834)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:688)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$$Lambda$35/872999436.run(Unknown Source)
        at java.security.AccessController.doPrivileged(Native Method)
        at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:687)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
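The vm-4 thread in the stack trace is sleeping in PartitionedRegion.shadowPRWaitForBucketRecovery, reached from ParallelGatewaySenderImpl.startWithCleanQueue while the sender queue was being constructed, so it seems that the test got stuck in bucket recovery while trying to restart the senders. This failure occurred once during a mass test run.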
The failing run can be found here:
https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-mass-test-run/jobs/distributed-test-openjdk8/builds/2486
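
For readers unfamiliar with this failure shape, here is a minimal sketch of how a worker that polls in a sleep loop can surface as a caller-side TimeoutException. It is not Geode code: the class StuckRecoverySketch, the methods waitForRecovery and awaitRecovery, and the AtomicBoolean flag are hypothetical stand-ins, and the sketch assumes only that the recovery wait has no deadline of its own, which the stack trace suggests but does not prove.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

public class StuckRecoverySketch {

  // Hypothetical stand-in for a "wait until recovery finishes" loop:
  // it polls a flag and sleeps between checks, with no deadline of its own.
  static void waitForRecovery(AtomicBoolean recovered) throws InterruptedException {
    while (!recovered.get()) {
      Thread.sleep(10);
    }
  }

  // Runs the wait on another thread and gives up after timeoutMillis.
  // Returns true if the wait finished, false if the caller timed out.
  static boolean awaitRecovery(AtomicBoolean recovered, long timeoutMillis) throws Exception {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    try {
      Future<?> invocation = executor.submit(() -> {
        waitForRecovery(recovered);
        return null;
      });
      try {
        invocation.get(timeoutMillis, TimeUnit.MILLISECONDS);
        return true;
      } catch (TimeoutException e) {
        // The worker is still sleeping in its loop; only the caller notices.
        return false;
      }
    } finally {
      // Interrupts the sleeping worker so the JVM can exit.
      executor.shutdownNow();
    }
  }

  public static void main(String[] args) throws Exception {
    // Recovery never completes: the caller times out.
    System.out.println(awaitRecovery(new AtomicBoolean(false), 200));
    // Recovery already complete: the wait returns immediately.
    System.out.println(awaitRecovery(new AtomicBoolean(true), 200));
  }
}
```

In the sketch the timeout only tells the caller that the worker did not finish; finding out why the condition never became true still requires the worker's own stack trace, which is what the "Stack trace for vm-4 thread-33" section of the report provides.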