Details
-
Bug
-
Status: Open
-
Major
-
Resolution: Unresolved
-
1.10.0
-
None
-
None
Description
https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/UpgradeTestOpenJDK8/builds/217#A run timed out.
The dunit-hangs.txt shows two tests not finished, and I think the ClientServerMiscBCDUnitTest giiEventQueueFromOldToCurrentMemberShouldSucceed is not really hang. It is just a little hard to read the stack dumps as they are for both test runs.
Started @ 2020-05-29 14:11:25.687 +0000
2020-05-29 14:59:03.911 +0000 org.apache.geode.internal.cache.tier.sockets.ClientServerMiscBCDUnitTest giiEventQueueFromOldToCurrentMemberShouldSucceed[12]
Ended @ 2020-05-29 14:59:03.911 +0000
Started @ 2020-05-29 14:11:25.635 +0000
2020-05-29 14:46:47.615 +0000 org.apache.geode.cache.lucene.RollingUpgradeQueryReturnsCorrectResultsAfterServersRollOverOnPartitionRegion luceneQueryReturnsCorrectResultsAfterServersRollOverOnPartitionRegion[from_v1.9.2,
Ended @ 2020-05-29 14:56:25.635 +0000
There are multiple DLock requesters for LockGrantor during PartitionedRepositoryManager.computeRepository().
"Pooled Waiting Message Processor 40" #163 daemon prio=5 os_prio=0 tid=0x00007fc94c043000 nid=0x955c waiting on condition [0x00007fc9d07ad000] java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00000000e351c5c0> (a java.util.concurrent.CountDownLatch$Sync) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037) at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328) at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277) at org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:72) at org.apache.geode.internal.util.concurrent.FutureResult.get(FutureResult.java:69) at org.apache.geode.distributed.internal.locks.DLockService.waitForLockGrantorFutureResult(DLockService.java:750) at org.apache.geode.distributed.internal.locks.DLockService.getLockGrantorId(DLockService.java:301) at org.apache.geode.distributed.internal.locks.DLockService.lockInterruptibly(DLockService.java:1445) at org.apache.geode.distributed.internal.locks.DLockService.lock(DLockService.java:1241) at org.apache.geode.distributed.internal.locks.DLockService.lock(DLockService.java:1232) at org.apache.geode.distributed.internal.locks.DLockService.lock(DLockService.java:1227) at org.apache.geode.distributed.internal.locks.DLockService.lock(DLockService.java:1223) at org.apache.geode.cache.lucene.internal.IndexRepositoryFactory.finishComputingRepository(IndexRepositoryFactory.java:105) at org.apache.geode.cache.lucene.internal.IndexRepositoryFactory.computeIndexRepository(IndexRepositoryFactory.java:65) at org.apache.geode.cache.lucene.internal.PartitionedRepositoryManager.computeRepository(PartitionedRepositoryManager.java:151) at org.apache.geode.cache.lucene.internal.PartitionedRepositoryManager.lambda$computeRepository$1(PartitionedRepositoryManager.java:170) at org.apache.geode.cache.lucene.internal.PartitionedRepositoryManager$$Lambda$308/437574105.apply(Unknown Source) at java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1853) - locked <0x00000000e35149f0> (a java.util.concurrent.ConcurrentHashMap$ReservationNode) at org.apache.geode.cache.lucene.internal.PartitionedRepositoryManager.computeRepository(PartitionedRepositoryManager.java:162) at org.apache.geode.cache.lucene.internal.LuceneBucketListener.lambda$afterPrimary$0(LuceneBucketListener.java:40) at org.apache.geode.cache.lucene.internal.LuceneBucketListener$$Lambda$307/1978619753.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at org.apache.geode.distributed.internal.ClusterOperationExecutors.runUntilShutdown(ClusterOperationExecutors.java:442) at org.apache.geode.distributed.internal.ClusterOperationExecutors.doWaitingThread(ClusterOperationExecutors.java:411) at org.apache.geode.distributed.internal.ClusterOperationExecutors$$Lambda$86/1114445410.invoke(Unknown Source) at org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:119) at org.apache.geode.logging.internal.executors.LoggingThreadFactory$$Lambda$82/1488044985.run(Unknown Source) at java.lang.Thread.run(Thread.java:748) Locked ownable synchronizers: - <0x00000000e3596b58> (a java.util.concurrent.ThreadPoolExecutor$Worker)
The remote process that should processing the requests are being closed and is wait for GrantorRequestProcessor.clearGrantor to finish:
"RMI TCP Connection(1)-172.17.0.30" #32 daemon prio=5 os_prio=0 tid=0x00007f8178001800 nid=0x939b waiting on condition [0x00007f826f269000] java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00000000e1982990> (a java.util.concurrent.locks.ReentrantLock$NonfairSync) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireNanos(AbstractQueuedSynchronizer.java:934) at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireNanos(AbstractQueuedSynchronizer.java:1247) at java.util.concurrent.locks.ReentrantLock.tryLock(ReentrantLock.java:442) at org.apache.geode.internal.util.concurrent.StoppableReentrantLock.lockInterruptibly(StoppableReentrantLock.java:85) at org.apache.geode.internal.util.concurrent.StoppableReentrantLock.lock(StoppableReentrantLock.java:71) at org.apache.geode.distributed.internal.ClusterElderManager.initializeElderState(ClusterElderManager.java:102) at org.apache.geode.distributed.internal.ClusterElderManager.getElderState(ClusterElderManager.java:97) at org.apache.geode.distributed.internal.ClusterDistributionManager.getElderState(ClusterDistributionManager.java:2091) at org.apache.geode.distributed.internal.locks.GrantorRequestProcessor.startElderCall(GrantorRequestProcessor.java:227) at org.apache.geode.distributed.internal.locks.GrantorRequestProcessor.basicOp(GrantorRequestProcessor.java:336) at org.apache.geode.distributed.internal.locks.GrantorRequestProcessor.basicOp(GrantorRequestProcessor.java:319) at org.apache.geode.distributed.internal.locks.GrantorRequestProcessor.clearGrantor(GrantorRequestProcessor.java:310) at org.apache.geode.distributed.internal.locks.DLockService.clearGrantor(DLockService.java:874) at org.apache.geode.distributed.internal.locks.DLockGrantor.destroy(DLockGrantor.java:1249) - locked <0x00000000fbd65f58> (a org.apache.geode.distributed.internal.locks.DLockGrantor) at org.apache.geode.distributed.internal.locks.DLockService.nullLockGrantorId(DLockService.java:649) at org.apache.geode.distributed.internal.locks.DLockService.basicDestroy(DLockService.java:2336) at org.apache.geode.distributed.internal.locks.DLockService.destroyAndRemove(DLockService.java:2256) - locked <0x00000000fbc85130> (a java.lang.Object) at org.apache.geode.distributed.internal.locks.DLockService.destroyServiceNamed(DLockService.java:2195) at org.apache.geode.distributed.DistributedLockService.destroy(DistributedLockService.java:84) at org.apache.geode.internal.cache.GemFireCacheImpl.destroyPartitionedRegionLockService(GemFireCacheImpl.java:2042) at org.apache.geode.internal.cache.GemFireCacheImpl.doClose(GemFireCacheImpl.java:2296) - locked <0x00000000e2282030> (a java.lang.Class for org.apache.geode.internal.cache.GemFireCacheImpl) at org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:2117) at org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:1985) at org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:1975) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.geode.cache.lucene.LuceneSearchWithRollingUpgradeTestBase.closeCache(LuceneSearchWithRollingUpgradeTestBase.java:755) at org.apache.geode.cache.lucene.LuceneSearchWithRollingUpgradeTestBase$12.run2(LuceneSearchWithRollingUpgradeTestBase.java:568) at org.apache.geode.cache30.CacheSerializableRunnable.run(CacheSerializableRunnable.java:55) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.geode.test.dunit.internal.MethodInvoker.executeObject(MethodInvoker.java:123) at org.apache.geode.test.dunit.internal.RemoteDUnitVM.executeMethodOnObject(RemoteDUnitVM.java:78) at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:357) at sun.rmi.transport.Transport$1.run(Transport.java:200) at sun.rmi.transport.Transport$1.run(Transport.java:197) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.Transport.serviceCall(Transport.java:196) at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:573) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:834) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:688) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$$Lambda$15/1032579838.run(Unknown Source) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:687) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Locked ownable synchronizers: - <0x00000000fb3d0db0> (a java.util.concurrent.ThreadPoolExecutor$Worker) - <0x00000000fbec63a8> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
The thread holding the lock is still in ElderInitProcessor.init. I can not tell why it is not able to finish it.
"Pooled Message Processor 2" #124 daemon prio=10 os_prio=0 tid=0x00007f8100034000 nid=0x9401 waiting on condition [0x00007f82648cc000] java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00000000fb403e98> (a java.util.concurrent.CountDownLatch$Sync) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037) at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328) at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277) at org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:72) at org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:731) at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:802) at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:779) at org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:865) at org.apache.geode.distributed.internal.locks.ElderInitProcessor.init(ElderInitProcessor.java:71) at org.apache.geode.distributed.internal.locks.ElderState.<init>(ElderState.java:53) at org.apache.geode.distributed.internal.ClusterElderManager.lambda$new$0(ClusterElderManager.java:41) at org.apache.geode.distributed.internal.ClusterElderManager$$Lambda$76/1497552839.get(Unknown Source) at org.apache.geode.distributed.internal.ClusterElderManager.initializeElderState(ClusterElderManager.java:106) at org.apache.geode.distributed.internal.ClusterElderManager.getElderState(ClusterElderManager.java:97) at org.apache.geode.distributed.internal.ClusterDistributionManager.getElderState(ClusterDistributionManager.java:2091) at org.apache.geode.distributed.internal.locks.GrantorRequestProcessor$GrantorRequestMessage.basicProcess(GrantorRequestProcessor.java:504) at org.apache.geode.distributed.internal.locks.GrantorRequestProcessor$GrantorRequestMessage.process(GrantorRequestProcessor.java:497) at org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:376) at org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:441) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at org.apache.geode.distributed.internal.ClusterOperationExecutors.runUntilShutdown(ClusterOperationExecutors.java:442) at org.apache.geode.distributed.internal.ClusterOperationExecutors.doProcessingThread(ClusterOperationExecutors.java:389) at org.apache.geode.distributed.internal.ClusterOperationExecutors$$Lambda$81/548873202.invoke(Unknown Source) at org.apache.geode.logging.internal.executors.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:119) at org.apache.geode.logging.internal.executors.LoggingThreadFactory$$Lambda$82/1455748764.run(Unknown Source) at java.lang.Thread.run(Thread.java:748) Locked ownable synchronizers: - <0x00000000fb437ee0> (a java.util.concurrent.ThreadPoolExecutor$Worker) - <0x00000000e1982990> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)