Geode / GEODE-9887

Deadlock when shutting down gateway sender threads unnecessarily delays server shutdown by 15 seconds


Details

    Description

      The thread dumps below show the deadlock:

      1. "Distributed system shutdown hook" acquires lock 0x00000000c445e988, starts the "ConcurrentParallelGatewaySenderEventProcessor Stopper Thread" threads, and waits for them to finish.

      2. "ConcurrentParallelGatewaySenderEventProcessor Stopper Thread5" sets the flag AckReaderThread.shutdown to true and waits for the AckReaderThread to finish by joining it for at most 15 seconds.

      3. The "AckReaderThread for : Event Processor for GatewaySender_sender1_4" thread waits for lock 0x00000000c445e988, which is owned by the "Distributed system shutdown hook" thread.

      The deadlock lasts only 15 seconds, because the thread join times out in every "ConcurrentParallelGatewaySenderEventProcessor Stopper Thread" thread, which lets them finish. Once they have finished, "Distributed system shutdown hook" can continue, release the lock, and complete the server shutdown.
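
      The three steps above can be reproduced in isolation. The following is a minimal sketch, not Geode code: the class name ShutdownStallDemo, the method runShutdown, and the thread name "ack-reader" are hypothetical stand-ins, and the join timeout is a parameter so that the stall can be observed with a few hundred milliseconds instead of 15 seconds.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-alone model of the reported stall; none of these names exist in Geode.
public class ShutdownStallDemo {

  /** Simulates the three steps and returns how long the "shutdown" was blocked, in ms. */
  public static long runShutdown(long joinTimeoutMillis) throws Exception {
    ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    AtomicBoolean shutdown = new AtomicBoolean(false);
    CountDownLatch lockTaken = new CountDownLatch(1);

    // Stand-in for AckReaderThread (step 3): it needs the write lock before it can exit.
    Thread ackReader = new Thread(() -> {
      try {
        lockTaken.await();
      } catch (InterruptedException e) {
        return;
      }
      lock.writeLock().lock(); // parks while the shutdown thread owns the lock
      try {
        // The flag is only observed here, after the lock has been released.
        if (shutdown.get()) {
          return;
        }
      } finally {
        lock.writeLock().unlock();
      }
    }, "ack-reader");
    ackReader.setDaemon(true);
    ackReader.start();

    ExecutorService stoppers = Executors.newSingleThreadExecutor();
    long start = System.nanoTime();
    lock.writeLock().lock(); // step 1: the shutdown thread takes the lock
    try {
      lockTaken.countDown();
      Callable<Void> stopper = () -> {
        shutdown.set(true);                // step 2: request shutdown ...
        ackReader.join(joinTimeoutMillis); // ... and join with a timeout
        return null;
      };
      stoppers.invokeAll(List.of(stopper)); // step 1: wait for the stopper to finish
    } finally {
      lock.writeLock().unlock();
      stoppers.shutdown();
    }
    long blockedMillis = (System.nanoTime() - start) / 1_000_000;
    ackReader.join(5_000); // with the lock released, the reader can now exit
    return blockedMillis;
  }

  public static void main(String[] args) throws Exception {
    System.out.println("shutdown blocked for ~" + runShutdown(300) + " ms");
  }
}
```

      In this model the shutdown thread is blocked for roughly the join timeout, because the reader cannot see the shutdown flag until the lock it is parked on has been released.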

       

      "Distributed system shutdown hook" #14 prio=5 os_prio=0 cpu=20.78ms elapsed=11.33s tid=0x00007f848c005000 nid=0x1e04 waiting on condition  [0x00007f83ec415000]
         java.lang.Thread.State: WAITING (parking)
              at jdk.internal.misc.Unsafe.park(java.base@11.0.13/Native Method)
              - parking to wait for  <0x00000000fcc00e50> (a java.util.concurrent.FutureTask)
              at java.util.concurrent.locks.LockSupport.park(java.base@11.0.13/LockSupport.java:194)
              at java.util.concurrent.FutureTask.awaitDone(java.base@11.0.13/FutureTask.java:447)
              at java.util.concurrent.FutureTask.get(java.base@11.0.13/FutureTask.java:190)
              at java.util.concurrent.AbstractExecutorService.invokeAll(java.base@11.0.13/AbstractExecutorService.java:247)
              at org.apache.geode.internal.cache.wan.parallel.ConcurrentParallelGatewaySenderEventProcessor.stopProcessing(ConcurrentParallelGatewaySenderEventProcessor.java:258)
              at org.apache.geode.internal.cache.wan.AbstractGatewaySender.stopProcessing(AbstractGatewaySender.java:726)
              at org.apache.geode.internal.cache.wan.parallel.ParallelGatewaySenderImpl.stop(ParallelGatewaySenderImpl.java:118)
              at org.apache.geode.internal.cache.GemFireCacheImpl.close(GemFireCacheImpl.java:2165)
              - locked <0x00000000c11a7400> (a java.lang.Class for org.apache.geode.internal.cache.GemFireCacheImpl)
              at org.apache.geode.distributed.internal.InternalDistributedSystem.disconnect(InternalDistributedSystem.java:1559)
              - locked <0x00000000c11a7400> (a java.lang.Class for org.apache.geode.internal.cache.GemFireCacheImpl)
              at org.apache.geode.distributed.internal.InternalDistributedSystem.lambda$static$7(InternalDistributedSystem.java:2202)
              at org.apache.geode.distributed.internal.InternalDistributedSystem$$Lambda$110/0x0000000100226840.run(Unknown Source)
              at java.lang.Thread.run(java.base@11.0.13/Thread.java:829)
         Locked ownable synchronizers:
              - <0x00000000c445e988> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
      
      
      
      "AckReaderThread for : Event Processor for GatewaySender_sender1_4" #402 daemon prio=5 os_prio=0 cpu=3168.26ms elapsed=640.74s tid=0x00007f8434023000 nid=0x1181 waiting on condition  [0x00007f83eda2b000]
         java.lang.Thread.State: WAITING (parking)
          at jdk.internal.misc.Unsafe.park(java.base@11.0.13/Native Method)
          - parking to wait for  <0x00000000c445e988> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
          at java.util.concurrent.locks.LockSupport.park(java.base@11.0.13/LockSupport.java:194)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(java.base@11.0.13/AbstractQueuedSynchronizer.java:885)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(java.base@11.0.13/AbstractQueuedSynchronizer.java:917)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(java.base@11.0.13/AbstractQueuedSynchronizer.java:1240)
          at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(java.base@11.0.13/ReentrantReadWriteLock.java:959)
          at org.apache.geode.internal.cache.wan.GatewaySenderEventRemoteDispatcher$AckReaderThread.run(GatewaySenderEventRemoteDispatcher.java:665)
         Locked ownable synchronizers:
          - None
      
      
      
      
      "ConcurrentParallelGatewaySenderEventProcessor Stopper Thread5" #872 daemon prio=5 os_prio=0 cpu=1.39ms elapsed=14.09s tid=0x00007f849801a000 nid=0x1e13 in Object.wait()  [0x00007f849c442000]
         java.lang.Thread.State: TIMED_WAITING (on object monitor)
              at java.lang.Object.wait(java.base@11.0.13/Native Method)
              - waiting on <no object reference available>
              at java.lang.Thread.join(java.base@11.0.13/Thread.java:1308)
              - waiting to re-lock in wait() <0x00000000c542ce20> (a org.apache.geode.internal.cache.wan.GatewaySenderEventRemoteDispatcher$AckReaderThread)
              at org.apache.geode.internal.cache.wan.GatewaySenderEventRemoteDispatcher$AckReaderThread.shutdown(GatewaySenderEventRemoteDispatcher.java:771)
              at org.apache.geode.internal.cache.wan.GatewaySenderEventRemoteDispatcher.stopAckReaderThread(GatewaySenderEventRemoteDispatcher.java:802)
              at org.apache.geode.internal.cache.wan.GatewaySenderEventRemoteDispatcher.stop(GatewaySenderEventRemoteDispatcher.java:826)
              at org.apache.geode.internal.cache.wan.AbstractGatewaySenderEventProcessor.stopProcessing(AbstractGatewaySenderEventProcessor.java:1222)
              at org.apache.geode.internal.cache.wan.AbstractGatewaySenderEventProcessor$SenderStopperCallable.call(AbstractGatewaySenderEventProcessor.java:1399)
              at org.apache.geode.internal.cache.wan.AbstractGatewaySenderEventProcessor$SenderStopperCallable.call(AbstractGatewaySenderEventProcessor.java:1387)
              at java.util.concurrent.FutureTask.run(java.base@11.0.13/FutureTask.java:264)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.13/ThreadPoolExecutor.java:1128)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.13/ThreadPoolExecutor.java:628)
              at java.lang.Thread.run(java.base@11.0.13/Thread.java:829)
         Locked ownable synchronizers:
              - <0x00000000fcf4daa8> (a java.util.concurrent.ThreadPoolExecutor$Worker)
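
      The dumps above tie the waiter to the owner by hand: the AckReaderThread is parked on <0x00000000c445e988>, and the same synchronizer appears under "Locked ownable synchronizers" of the shutdown hook. The same pairing can be queried from inside a JVM through the standard java.lang.management API. The following is only a sketch: the class LockOwnerLookup and its methods are hypothetical names, and the demo uses its own lock rather than anything from Geode.

```java
import java.lang.management.LockInfo;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical helper; it only uses the standard ThreadMXBean API.
public class LockOwnerLookup {

  /**
   * Returns the name of the thread that holds the ownable synchronizer on which
   * {@code waiter} is parked, or null if no such owner can be found right now.
   */
  public static String findOwner(Thread waiter) {
    ThreadMXBean mx = ManagementFactory.getThreadMXBean();
    ThreadInfo waiting = mx.getThreadInfo(waiter.getId());
    if (waiting == null || waiting.getLockInfo() == null || waiting.getLockOwnerId() < 0) {
      return null; // not parked on an owned lock (yet)
    }
    LockInfo wanted = waiting.getLockInfo();
    // lockedSynchronizers=true fills in what a dump prints as "Locked ownable synchronizers".
    ThreadInfo owner =
        mx.getThreadInfo(new long[] {waiting.getLockOwnerId()}, false, true)[0];
    if (owner == null) {
      return null;
    }
    for (LockInfo held : owner.getLockedSynchronizers()) {
      if (held.getIdentityHashCode() == wanted.getIdentityHashCode()
          && held.getClassName().equals(wanted.getClassName())) {
        return owner.getThreadName();
      }
    }
    return null;
  }

  /** Holds a write lock, lets a second thread park on it, and looks up the owner. */
  public static String demo() throws InterruptedException {
    ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    lock.writeLock().lock();
    Thread waiter = new Thread(() -> {
      lock.writeLock().lock();
      lock.writeLock().unlock();
    }, "waiter");
    waiter.setDaemon(true);
    String owner = null;
    try {
      waiter.start();
      long deadline = System.nanoTime() + 5_000_000_000L;
      // Poll, because the waiter needs a moment to reach the parked state.
      while (owner == null && System.nanoTime() < deadline) {
        Thread.sleep(10);
        owner = findOwner(waiter);
      }
    } finally {
      lock.writeLock().unlock();
    }
    waiter.join(5_000);
    return owner;
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("lock owner: " + demo());
  }
}
```

      Note that this only reports who holds the lock. The stall described here is not a lock cycle, since the stopper thread waits in a timed Thread.join rather than on a lock, so the owner lookup has to be read together with the stack traces.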
      

       

          People

            jvarenina Jakov Varenina