Geode / GEODE-9790

CI Failure: ParallelWANPersistenceEnabledGatewaySenderDUnitTest.testpersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived fails due to TimeoutException while restarting senders

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: wan
    • Labels: None

    Description

      We've seen this test fail in CI with the following stack trace:

      ParallelWANPersistenceEnabledGatewaySenderDUnitTest > testpersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived FAILED
          java.lang.AssertionError: java.util.concurrent.TimeoutException: Timed out waiting 300000 milliseconds for AsyncInvocation to complete.
              at org.apache.geode.test.dunit.AsyncInvocation.await(AsyncInvocation.java:180)
              at org.apache.geode.internal.cache.wan.WANTestBase.startSenderwithCleanQueuesInVMsAsync(WANTestBase.java:1100)
              at org.apache.geode.internal.cache.wan.parallel.ParallelWANPersistenceEnabledGatewaySenderDUnitTest.testpersistentWanGateway_restartSenderWithCleanQueues_expectNoEventsReceived(ParallelWANPersistenceEnabledGatewaySenderDUnitTest.java:1780)
      
              Caused by:
              java.util.concurrent.TimeoutException: Timed out waiting 300000 milliseconds for AsyncInvocation to complete.
                  at org.apache.geode.test.dunit.AsyncInvocation.timeoutIfAlive(AsyncInvocation.java:509)
                  at org.apache.geode.test.dunit.AsyncInvocation.await(AsyncInvocation.java:447)
                  at org.apache.geode.test.dunit.AsyncInvocation.await(AsyncInvocation.java:178)
                  ... 2 more
      
                  Caused by:
                  org.apache.geode.test.dunit.internal.StackTrace: Stack trace for vm-4 thread-33
                      at java.lang.Thread.sleep(Native Method)
                      at org.apache.geode.internal.cache.PartitionedRegion.shadowPRWaitForBucketRecovery(PartitionedRegion.java:10057)
                      at org.apache.geode.internal.cache.wan.parallel.ParallelGatewaySenderQueue.addShadowPartitionedRegionForUserPR(ParallelGatewaySenderQueue.java:570)
                      at org.apache.geode.internal.cache.wan.parallel.ParallelGatewaySenderQueue.addShadowPartitionedRegionForUserPR(ParallelGatewaySenderQueue.java:447)
                      at org.apache.geode.internal.cache.wan.parallel.ParallelGatewaySenderQueue.<init>(ParallelGatewaySenderQueue.java:281)
                      at org.apache.geode.internal.cache.wan.parallel.ParallelGatewaySenderQueue.<init>(ParallelGatewaySenderQueue.java:250)
                      at org.apache.geode.internal.cache.wan.parallel.ParallelGatewaySenderEventProcessor.initializeMessageQueue(ParallelGatewaySenderEventProcessor.java:80)
                      at org.apache.geode.internal.cache.wan.parallel.ParallelGatewaySenderEventProcessor.<init>(ParallelGatewaySenderEventProcessor.java:65)
                      at org.apache.geode.cache.wan.internal.parallel.RemoteParallelGatewaySenderEventProcessor.<init>(RemoteParallelGatewaySenderEventProcessor.java:39)
                      at org.apache.geode.cache.wan.internal.parallel.RemoteConcurrentParallelGatewaySenderEventProcessor.createProcessors(RemoteConcurrentParallelGatewaySenderEventProcessor.java:50)
                      at org.apache.geode.internal.cache.wan.parallel.ConcurrentParallelGatewaySenderEventProcessor.<init>(ConcurrentParallelGatewaySenderEventProcessor.java:102)
                      at org.apache.geode.cache.wan.internal.parallel.RemoteConcurrentParallelGatewaySenderEventProcessor.<init>(RemoteConcurrentParallelGatewaySenderEventProcessor.java:38)
                      at org.apache.geode.cache.wan.internal.parallel.ParallelGatewaySenderImpl.start(ParallelGatewaySenderImpl.java:86)
                      at org.apache.geode.cache.wan.internal.parallel.ParallelGatewaySenderImpl.startWithCleanQueue(ParallelGatewaySenderImpl.java:61)
                      at org.apache.geode.internal.cache.wan.WANTestBase.startSenderwithCleanQueues(WANTestBase.java:1134)
                      at org.apache.geode.internal.cache.wan.WANTestBase.lambda$startSenderwithCleanQueuesInVMsAsync$1527b440$1(WANTestBase.java:1096)
                      at org.apache.geode.internal.cache.wan.WANTestBase$$Lambda$456/976167913.run(Unknown Source)
                      at org.apache.geode.test.dunit.internal.IdentifiableRunnable.run(IdentifiableRunnable.java:41)
                      at sun.reflect.GeneratedMethodAccessor309.invoke(Unknown Source)
                      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
                      at java.lang.reflect.Method.invoke(Method.java:498)
                      at org.apache.geode.test.dunit.internal.MethodInvoker.executeObject(MethodInvoker.java:123)
                      at org.apache.geode.test.dunit.internal.RemoteDUnitVM.executeMethodOnObject(RemoteDUnitVM.java:78)
                      at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
                      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
                      at java.lang.reflect.Method.invoke(Method.java:498)
                      at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:357)
                      at sun.rmi.transport.Transport$1.run(Transport.java:200)
                      at sun.rmi.transport.Transport$1.run(Transport.java:197)
                      at java.security.AccessController.doPrivileged(Native Method)
                      at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
                      at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:573)
                      at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:834)
                      at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:688)
                      at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$$Lambda$35/872999436.run(Unknown Source)
                      at java.security.AccessController.doPrivileged(Native Method)
                      at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:687)
                      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
                      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
                      at java.lang.Thread.run(Thread.java:748)
      

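      It seems that the test got stuck in bucket recovery while restarting the senders. The trace shows vm-4 thread-33 sleeping in PartitionedRegion.shadowPRWaitForBucketRecovery, reached from ParallelGatewaySenderImpl.startWithCleanQueue while ParallelGatewaySenderQueue was adding the shadow partitioned region, until the 300000 ms AsyncInvocation timeout expired. This failure occurred once during a mass test run.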
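
      For illustration only, the sketch below shows the general shape of a poll-and-sleep wait that gives up at a deadline instead of sleeping until an outer timeout fires. It is not Geode's code and is not a proposed fix: BoundedRecoveryWait, waitUntil and their parameters are hypothetical names, and the BooleanSupplier merely stands in for whatever "recovery finished" check the real loop performs.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, NOT part of Apache Geode: polls a condition until a deadline.
final class BoundedRecoveryWait {

  private BoundedRecoveryWait() {}

  /**
   * Polls {@code condition} roughly every {@code pollMillis} milliseconds.
   * Returns true as soon as the condition holds, or false once
   * {@code timeoutMillis} has elapsed or the thread is interrupted.
   */
  static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (!condition.getAsBoolean()) {
      long remainingNanos = deadline - System.nanoTime();
      if (remainingNanos <= 0) {
        return false; // deadline reached; the caller decides how to report it
      }
      // Never sleep past the deadline, and always sleep at least 1 ms.
      long sleepMillis = Math.min(pollMillis, Math.max(1L, remainingNanos / 1_000_000L));
      try {
        Thread.sleep(sleepMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        return false;
      }
    }
    return true;
  }
}
```

      With a condition that never becomes true, such as BoundedRecoveryWait.waitUntil(() -> false, 200, 50), the call returns false after about 200 ms, so the caller can log which wait expired rather than relying on the outer AsyncInvocation timeout.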

      The failing run can be found here:
      https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-mass-test-run/jobs/distributed-test-openjdk8/builds/2486

          People

            Assignee: Unassigned
            Sarm Kahel Benjamin P Ross