Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Description
When the broker hits a "critical" IO error it shuts itself down. During shutdown, however, it sends multiple notifications. Sending a notification can trigger disk IO, which can delay (and potentially hang) the shutdown. Here is an example from a thread dump of a broker hung in shutdown, where the shutdown thread is blocked in a journal write reached through ManagementServiceImpl.sendNotification:

"Thread-11" #73 prio=5 os_prio=0 tid=0x00007fa3fc002800 nid=0x1907 waiting on condition [0x00007fa48d60d000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x000000009b1055f0> (a java.util.concurrent.CountDownLatch$Sync)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
    at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
    at org.apache.activemq.artemis.core.journal.impl.SimpleWaitIOCallback.waitCompletion(SimpleWaitIOCallback.java:61)
    at org.apache.activemq.artemis.core.journal.impl.JournalBase.appendAddRecord(JournalBase.java:52)
    at org.apache.activemq.artemis.core.journal.impl.JournalImpl.appendAddRecord(JournalImpl.java:93)
    at org.apache.activemq.artemis.core.journal.Journal.appendAddRecord(Journal.java:65)
    at org.apache.activemq.artemis.core.persistence.impl.journal.AbstractJournalStorageManager.storeID(AbstractJournalStorageManager.java:805)
    at org.apache.activemq.artemis.core.persistence.impl.journal.BatchingIDGenerator.storeID(BatchingIDGenerator.java:147)
    at org.apache.activemq.artemis.core.persistence.impl.journal.BatchingIDGenerator.saveCheckPoint(BatchingIDGenerator.java:132)
    - locked <0x0000000090f23850> (a org.apache.activemq.artemis.core.persistence.impl.journal.BatchingIDGenerator)
    at org.apache.activemq.artemis.core.persistence.impl.journal.BatchingIDGenerator.generateID(BatchingIDGenerator.java:111)
    at org.apache.activemq.artemis.core.persistence.impl.journal.AbstractJournalStorageManager.generateID(AbstractJournalStorageManager.java:334)
    at org.apache.activemq.artemis.core.server.management.impl.ManagementServiceImpl.sendNotification(ManagementServiceImpl.java:678)
    - locked <0x0000000090b0ab20> (a java.lang.Object)
    - locked <0x0000000090f21550> (a org.apache.activemq.artemis.core.server.management.impl.ManagementServiceImpl)
    at org.apache.activemq.artemis.core.server.cluster.impl.BroadcastGroupImpl.stop(BroadcastGroupImpl.java:142)
    - locked <0x0000000090b0b650> (a org.apache.activemq.artemis.core.server.cluster.impl.BroadcastGroupImpl)
    at org.apache.activemq.artemis.core.server.cluster.ClusterManager.stop(ClusterManager.java:310)
    - locked <0x0000000090b0b508> (a org.apache.activemq.artemis.core.server.cluster.ClusterManager)
    at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.stopComponent(ActiveMQServerImpl.java:1355)
    at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.stop(ActiveMQServerImpl.java:1090)
    - locked <0x0000000090f1d128> (a org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl)
    at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.stop(ActiveMQServerImpl.java:1054)
    at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$5.run(ActiveMQServerImpl.java:860)
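
The pattern in the trace (a notification that waits on a latch for a journal write to complete) can be illustrated with a small, self-contained sketch. This is not the Artemis code and not necessarily how the issue was fixed: StalledJournal, NotificationSender, markCriticalIOError and the Outcome enum are hypothetical names invented for illustration, and the notification type is just a string. The sketch assumes one plausible mitigation, which is to skip notifications once a critical IO error has been flagged.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class ShutdownNotificationSketch {

    /** Possible results of trying to send a notification in this sketch. */
    enum Outcome { SENT, SKIPPED, TIMED_OUT }

    /** Hypothetical journal whose writes never complete, as after a critical IO error. */
    static final class StalledJournal {
        // Never counted down, so every wait runs until its timeout expires.
        private final CountDownLatch completion = new CountDownLatch(1);

        /** Returns true only if the write completed within the timeout. */
        boolean appendAndWait(long timeoutMillis) {
            try {
                return completion.await(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
                return false;
            }
        }
    }

    /** Hypothetical notification sender that persists an ID before sending. */
    static final class NotificationSender {
        private final StalledJournal journal;
        private volatile boolean criticalIOError;

        NotificationSender(StalledJournal journal) {
            this.journal = journal;
        }

        /** Assumed hook: called when the broker detects a critical IO error. */
        void markCriticalIOError() {
            criticalIOError = true;
        }

        Outcome sendNotification(String type, long timeoutMillis) {
            if (criticalIOError) {
                // Disk is known to be unhealthy: do not touch the journal at all.
                return Outcome.SKIPPED;
            }
            // Mirrors the trace: sending requires a journal write that blocks on a latch.
            if (!journal.appendAndWait(timeoutMillis)) {
                return Outcome.TIMED_OUT;
            }
            return Outcome.SENT;
        }
    }

    public static void main(String[] args) {
        NotificationSender sender = new NotificationSender(new StalledJournal());

        // Without the guard, the caller is parked for the whole timeout.
        System.out.println(sender.sendNotification("BROADCAST_GROUP_STOPPED", 50));

        // With the guard set, the call returns immediately without disk IO.
        sender.markCriticalIOError();
        System.out.println(sender.sendNotification("BROADCAST_GROUP_STOPPED", 50));
    }
}
```

In the sketch the first call blocks for the full timeout, much as Thread-11 is parked in CountDownLatch.await, while the second returns without touching the journal. Whether skipping, deferring, or time-bounding notifications is the right choice for the broker is a design decision this sketch does not settle.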