Details
-
Bug
-
Status: Closed
-
Minor
-
Resolution: Fixed
-
qpid-java-broker-8.0.4
-
None
Description
Issue is similar to QPID-8489: on certain conditions SSL connection thread falls into an infinite loop consuming CPU.
Stacktrace from the threaddump:
"IO-/127.0.0.1:50414" #32 prio=5 os_prio=0 cpu=43504522.88ms elapsed=330519.20s tid=0x00007f47a0375000 nid=0x9089 runnable [0x00007f47956fe000] java.lang.Thread.State: RUNNABLE at sun.security.ssl.TransportContext.getHandshakeStatus(java.base@11.0.7/TransportContext.java:571) at sun.security.ssl.SSLEngineImpl.getHandshakeStatus(java.base@11.0.7/SSLEngineImpl.java:801) - locked <0x00007f4b721e7340> (a sun.security.ssl.SSLEngineImpl) at org.apache.qpid.server.transport.NonBlockingConnectionTLSDelegate.wrapBufferArray(NonBlockingConnectionTLSDelegate.java:239) at org.apache.qpid.server.transport.NonBlockingConnectionTLSDelegate.doWrite(NonBlockingConnectionTLSDelegate.java:164) at org.apache.qpid.server.transport.NonBlockingConnection.doWrite(NonBlockingConnection.java:521) at org.apache.qpid.server.transport.NonBlockingConnection.shutdownFinalWrite(NonBlockingConnection.java:422) at org.apache.qpid.server.transport.NonBlockingConnection.shutdown(NonBlockingConnection.java:384) at org.apache.qpid.server.transport.NonBlockingConnection.doWork(NonBlockingConnection.java:313) at org.apache.qpid.server.transport.NetworkConnectionScheduler.processConnection(NetworkConnectionScheduler.java:134) at org.apache.qpid.server.transport.SelectorThread$ConnectionProcessor.processConnection(SelectorThread.java:575) at org.apache.qpid.server.transport.SelectorThread$SelectionTask.performSelect(SelectorThread.java:366) at org.apache.qpid.server.transport.SelectorThread$SelectionTask.run(SelectorThread.java:97) at org.apache.qpid.server.transport.SelectorThread.run(SelectorThread.java:533) at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.7/ThreadPoolExecutor.java:1128) at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.7/ThreadPoolExecutor.java:628) at org.apache.qpid.server.bytebuffer.QpidByteBufferFactory.lambda$null$0(QpidByteBufferFactory.java:464) at org.apache.qpid.server.bytebuffer.QpidByteBufferFactory$$Lambda$64/0x00007f47d52bcc40.run(Unknown Source) at java.lang.Thread.run(java.base@11.0.7/Thread.java:834)
Stacktrace 2 from the threaddump:
"IO-/127.0.0.1:50414" #32 prio=5 os_prio=0 cpu=43536330.84ms elapsed=330555.26s tid=0x00007f47a0375000 nid=0x9089 runnable [0x00007f47956fe000] java.lang.Thread.State: RUNNABLE at org.apache.qpid.server.bytebuffer.QpidByteBufferFactory.encryptSSL(QpidByteBufferFactory.java:178) at org.apache.qpid.server.bytebuffer.QpidByteBuffer.encryptSSL(QpidByteBuffer.java:62) at org.apache.qpid.server.transport.NonBlockingConnectionTLSDelegate.wrapBufferArray(NonBlockingConnectionTLSDelegate.java:255) at org.apache.qpid.server.transport.NonBlockingConnectionTLSDelegate.doWrite(NonBlockingConnectionTLSDelegate.java:164) at org.apache.qpid.server.transport.NonBlockingConnection.doWrite(NonBlockingConnection.java:521) at org.apache.qpid.server.transport.NonBlockingConnection.shutdownFinalWrite(NonBlockingConnection.java:422) at org.apache.qpid.server.transport.NonBlockingConnection.shutdown(NonBlockingConnection.java:384) at org.apache.qpid.server.transport.NonBlockingConnection.doWork(NonBlockingConnection.java:313) at org.apache.qpid.server.transport.NetworkConnectionScheduler.processConnection(NetworkConnectionScheduler.java:134) at org.apache.qpid.server.transport.SelectorThread$ConnectionProcessor.processConnection(SelectorThread.java:575) at org.apache.qpid.server.transport.SelectorThread$SelectionTask.performSelect(SelectorThread.java:366) at org.apache.qpid.server.transport.SelectorThread$SelectionTask.run(SelectorThread.java:97) at org.apache.qpid.server.transport.SelectorThread.run(SelectorThread.java:533) at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.7/ThreadPoolExecutor.java:1128) at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.7/ThreadPoolExecutor.java:628) at org.apache.qpid.server.bytebuffer.QpidByteBufferFactory.lambda$null$0(QpidByteBufferFactory.java:464) at org.apache.qpid.server.bytebuffer.QpidByteBufferFactory$$Lambda$64/0x00007f47d52bcc40.run(Unknown Source) at java.lang.Thread.run(java.base@11.0.7/Thread.java:834)
Stucked connection details (retrieved via broker REST API):
{ "id" : "60b493c2-de91-4690-8e4c-1e3d5d5ea353", "name" : "[8332442] 127.0.0.1:64711", "type" : "AMQP_1_0", "desiredState" : "ACTIVE", "state" : "ACTIVE", "durable" : false, "lifetimePolicy" : "PERMANENT", "incoming" : true, "incomingIdleTimeout" : 60000, "lastOpenedTime" : 1618942801846, "maxUncommittedInMemorySize" : 10485760, "outgoingIdleTimeout" : 0, "port" : "amqps", "protocol" : "AMQP_1_0", "remoteAddress" : "/127.0.0.1:64711", "sendSaslFinalChallengeAsChallenge" : false, "sessionCountLimit" : 1, "transport" : "SSL", "transportInfo" : "TLSv1.2 ; TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256", "lastUpdatedTime" : 1618942801846, "createdTime" : 1618942801846, "statistics" : { "bytesIn" : 0, "bytesOut" : 0, "lastInboundMessageTime" : 1618942801846, "lastIoTime" : 1618942801846, "lastMessageTime" : 1618942801846, "lastOutboundMessageTime" : 1618942801846, "localTransactionBegins" : 0, "localTransactionOpen" : 0, "localTransactionRollbacks" : 0, "messagesIn" : 0, "messagesOut" : 0, "sessionCount" : 0, "transactedMessagesIn" : 0, "transactedMessagesOut" : 0 } }
It seems that broker tries to close SSL connection and loops inside NonBlockingConnection.shutdownFinalWrite() continuously calling NonBlockingConnection.doWrite() -> NonBlockingConnectionTLSDelegate.doWrite() -> NonBlockingConnectionTLSDelegate.wrapBufferArray().
After adding circuit breaker and explicit logging in methods NonBlockingConnection.doWrite(), NonBlockingConnectionTLSDelegate.doWrite() and NonBlockingConnectionTLSDelegate.wrapBufferArray() following logs were retrieved:
2021-05-10 06:30:54,637 INFO [Broker-Config] (q.m.c.open) - [con:9,435,937(/127.0.0.1:52648)] CON-1001 : Open : Destination : amqps(90.163.253.5:10170) : Protocol Version : 1.0 : SSL 2021-05-10 06:30:54,688 INFO [IO-/127.0.0.1:52648] (q.m.c.open) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)] CON-1001 : Open : Destination : amqps(90.163.253.5:10170) : Protocol Version : 1.0 : SSL : Client ID : Exxeta-299e32f1-1d0a-4db6-aeb8-1822411b7fc6 : Client Version : 0.50.0 : Client Product : QpidJMS 2021-05-10 06:30:54,689 INFO [IO-/127.0.0.1:52648] (q.m.a.allowed) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)] ACL-1001 : Allowed : Access Virtualhost NAME=default,VIRTUALHOST_NAME=default 2021-05-10 06:30:54,709 INFO [IO-/127.0.0.1:52648] (q.m.c.create) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:0] CHN-1001 : Create 2021-05-10 06:30:54,739 INFO [IO-/127.0.0.1:52648] (q.m.c.create) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:1] CHN-1001 : Create 2021-05-10 06:30:54,771 INFO [VirtualHostNode-default-Config] (q.m.a.allowed) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:1] ACL-1001 : Allowed : Consume Queue NAME=broadcast.VTVTV_YYYYXXXXXZZZSSS.TradeConfirmation,DURABLE=true,EXCLUSIVE=false,TEMPORARY=false,AUTO_DELETE=false,VIRTUALHOST_NAME=default,CREATED_BY=admin 2021-05-10 06:30:54,771 INFO [VirtualHostNode-default-Config] (q.m.s.create) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:1] [sub:36,978(vh(/default)/qu(broadcast.VTVTV_YYYYXXXXXZZZSSS.TradeConfirmation)] SUB-1001 : Create 2021-05-10 06:30:54,799 INFO [VirtualHostNode-default-Config] (q.m.a.allowed) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:1] ACL-1001 : Allowed : Consume Queue NAME=broadcast.VTVTV_YYYYXXXXXZZZSSS.Workflow,DURABLE=true,EXCLUSIVE=false,TEMPORARY=false,AUTO_DELETE=false,VIRTUALHOST_NAME=default,CREATED_BY=admin 2021-05-10 06:30:54,799 INFO [VirtualHostNode-default-Config] (q.m.s.create) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:1] [sub:36,979(vh(/default)/qu(broadcast.VTVTV_YYYYXXXXXZZZSSS.Workflow)] SUB-1001 : Create 2021-05-10 06:30:54,816 INFO [VirtualHostNode-default-Config] (q.m.a.allowed) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:1] ACL-1001 : Allowed : Consume Queue NAME=broadcast.VTVTV_YYYYXXXXXZZZSSS.Public,DURABLE=true,EXCLUSIVE=false,TEMPORARY=false,AUTO_DELETE=false,VIRTUALHOST_NAME=default,CREATED_BY=admin 2021-05-10 06:30:54,816 INFO [VirtualHostNode-default-Config] (q.m.s.create) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:1] [sub:36,980(vh(/default)/qu(broadcast.VTVTV_YYYYXXXXXZZZSSS.Public)] SUB-1001 : Create 2021-05-10 06:30:54,834 INFO [VirtualHostNode-default-Config] (q.m.a.allowed) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:1] ACL-1001 : Allowed : Consume Queue NAME=response.VTVTV_YYYYXXXXXZZZSSS,DURABLE=true,EXCLUSIVE=false,TEMPORARY=false,AUTO_DELETE=false,VIRTUALHOST_NAME=default,CREATED_BY=admin 2021-05-10 06:30:54,834 INFO [VirtualHostNode-default-Config] (q.m.s.create) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:1] [sub:36,981(vh(/default)/qu(response.VTVTV_YYYYXXXXXZZZSSS)] SUB-1001 : Create 2021-05-10 06:30:54,856 INFO [IO-/127.0.0.1:52648] (q.m.s.create) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:1] SND-1001 : Create : qpid-jms:sender:ID:e72a83ab-2332-4e64-b891-0bf26cb8f632:1:1:1:request.VTVTV_YYYYXXXXXZZZSSS : request.VTVTV_YYYYXXXXXZZZSSS 2021-05-10 10:01:40,576 INFO [IO-/127.0.0.1:52648] (q.m.c.close) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:0] [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:0] CHN-1003 : Close 2021-05-10 10:01:40,576 INFO [VirtualHostNode-default-Config] (q.m.s.close) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:1] [sub:36,978(vh(/default)/qu(broadcast.VTVTV_YYYYXXXXXZZZSSS.TradeConfirmation)] SUB-1002 : Close 2021-05-10 10:01:40,576 INFO [IO-/127.0.0.1:52648] (q.m.s.close) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:1] SND-1002 : Close : qpid-jms:sender:ID:e72a83ab-2332-4e64-b891-0bf26cb8f632:1:1:1:request.VTVTV_YYYYXXXXXZZZSSS : request.VTVTV_YYYYXXXXXZZZSSS 2021-05-10 10:01:40,577 INFO [VirtualHostNode-default-Config] (q.m.s.close) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:1] [sub:36,980(vh(/default)/qu(broadcast.VTVTV_YYYYXXXXXZZZSSS.Public)] SUB-1002 : Close 2021-05-10 10:01:40,578 INFO [VirtualHostNode-default-Config] (q.m.s.close) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:1] [sub:36,981(vh(/default)/qu(response.VTVTV_YYYYXXXXXZZZSSS)] SUB-1002 : Close 2021-05-10 10:01:40,578 INFO [VirtualHostNode-default-Config] (q.m.s.close) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:1] [sub:36,979(vh(/default)/qu(broadcast.VTVTV_YYYYXXXXXZZZSSS.Workflow)] SUB-1002 : Close 2021-05-10 10:01:40,579 INFO [IO-/127.0.0.1:52648] (q.m.c.close) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:1] [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:1] CHN-1003 : Close 2021-05-10 10:01:40,579 INFO [IO-/127.0.0.1:52648] (o.a.q.s.p.v.AMQPConnection_1_0Impl) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)] : Connection closed with error : amqp:resource-limit-exceeded - local-idle-timeout expired 2021-05-10 10:01:40,644 WARN [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnectionTLSDelegate) - bufCount: 2 2021-05-10 10:01:40,644 WARN [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnectionTLSDelegate) - _status: [ Status = CLOSED, HandshakeStatus = NOT_HANDSHAKING, bytesConsumed = 0, bytesProduced = 0 ], _sslEngine.isOutboundDone(): true, _sslEngine.isInboundDone(): true, _sslEngine.getPeerHost(): null, _sslEngine.getPeerPort(): -1, buffers.size(): 2, _encryptedOutput.size(): 0, _netOutputBuffer.remaining(): 152185, _sslEngine.getSession().getPacketBufferSize(): 16709, _sslEngine.getSession().getCreationTime(): 1620621054549, _sslEngine.getSession().getLastAccessedTime(): 1620621054549, _sslEngine.getHandshakeSession().isValid(): null handshake session 2021-05-10 10:01:40,645 WARN [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnectionTLSDelegate) - buffers.size(): 2, bufsSent: false, bytesWritten: -1, _encryptedOutput: 0 2021-05-10 10:01:40,645 WARN [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnection) - Looping detected: false, 0, 2 2021-05-10 10:01:40,645 WARN [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnectionTLSDelegate) - bufCount: 2 2021-05-10 10:01:40,645 WARN [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnectionTLSDelegate) - _status: [ Status = CLOSED, HandshakeStatus = NOT_HANDSHAKING, bytesConsumed = 0, bytesProduced = 0 ], _sslEngine.isOutboundDone(): true, _sslEngine.isInboundDone(): true, _sslEngine.getPeerHost(): null, _sslEngine.getPeerPort(): -1, buffers.size(): 2, _encryptedOutput.size(): 0, _netOutputBuffer.remaining(): 152185, _sslEngine.getSession().getPacketBufferSize(): 16709, _sslEngine.getSession().getCreationTime(): 1620621054549, _sslEngine.getSession().getLastAccessedTime(): 1620621054549, _sslEngine.getHandshakeSession().isValid(): null handshake session 2021-05-10 10:01:40,645 WARN [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnectionTLSDelegate) - buffers.size(): 2, bufsSent: false, bytesWritten: -1, _encryptedOutput: 0 2021-05-10 10:01:40,645 WARN [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnection) - Looping detected: false, 0, 2 2021-05-10 10:01:40,645 WARN [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnectionTLSDelegate) - bufCount: 2 2021-05-10 10:01:40,645 WARN [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnectionTLSDelegate) - _status: [ Status = CLOSED, HandshakeStatus = NOT_HANDSHAKING, bytesConsumed = 0, bytesProduced = 0 ], _sslEngine.isOutboundDone(): true, _sslEngine.isInboundDone(): true, _sslEngine.getPeerHost(): null, _sslEngine.getPeerPort(): -1, buffers.size(): 2, _encryptedOutput.size(): 0, _netOutputBuffer.remaining(): 152185, _sslEngine.getSession().getPacketBufferSize(): 16709, _sslEngine.getSession().getCreationTime(): 1620621054549, _sslEngine.getSession().getLastAccessedTime(): 1620621054549, _sslEngine.getHandshakeSession().isValid(): null handshake session 2021-05-10 10:01:40,645 WARN [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnectionTLSDelegate) - buffers.size(): 2, bufsSent: false, bytesWritten: -1, _encryptedOutput: 0 2021-05-10 10:01:40,645 WARN [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnection) - Looping detected: false, 0, 2 2021-05-10 10:01:40,645 WARN [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnectionTLSDelegate) - bufCount: 2 2021-05-10 10:01:40,645 WARN [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnectionTLSDelegate) - _status: [ Status = CLOSED, HandshakeStatus = NOT_HANDSHAKING, bytesConsumed = 0, bytesProduced = 0 ], _sslEngine.isOutboundDone(): true, _sslEngine.isInboundDone(): true, _sslEngine.getPeerHost(): null, _sslEngine.getPeerPort(): -1, buffers.size(): 2, _encryptedOutput.size(): 0, _netOutputBuffer.remaining(): 152185, _sslEngine.getSession().getPacketBufferSize(): 16709, _sslEngine.getSession().getCreationTime(): 1620621054549, _sslEngine.getSession().getLastAccessedTime(): 1620621054549, _sslEngine.getHandshakeSession().isValid(): null handshake session 2021-05-10 10:01:40,645 WARN [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnectionTLSDelegate) - buffers.size(): 2, bufsSent: false, bytesWritten: -1, _encryptedOutput: 0 2021-05-10 10:01:40,645 WARN [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnection) - Looping detected: false, 0, 2 2021-05-10 10:01:40,645 WARN [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnectionTLSDelegate) - bufCount: 2 2021-05-10 10:01:40,646 WARN [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnectionTLSDelegate) - _status: [ Status = CLOSED, HandshakeStatus = NOT_HANDSHAKING, bytesConsumed = 0, bytesProduced = 0 ], _sslEngine.isOutboundDone(): true, _sslEngine.isInboundDone(): true, _sslEngine.getPeerHost(): null, _sslEngine.getPeerPort(): -1, buffers.size(): 2, _encryptedOutput.size(): 0, _netOutputBuffer.remaining(): 152185, _sslEngine.getSession().getPacketBufferSize(): 16709, _sslEngine.getSession().getCreationTime(): 1620621054549, _sslEngine.getSession().getLastAccessedTime(): 1620621054549, _sslEngine.getHandshakeSession().isValid(): null handshake session 2021-05-10 10:01:40,646 INFO [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnection) - Exception performing final write/close for '/127.0.0.1:52648': Looping detected, executing circuit breaker 2021-05-10 10:01:40,647 INFO [Broker-Config] (q.m.c.close) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@null/default)] CON-1002 : Close
Noticeable is SSLEngineResult.Status.CLOSED as well as SSLEngineResult.HandshakeStatus NOT_HANDSHAKING. Combination of those statuses indicates that the last operation closed broker side of the SSLEngine, or the operation could not be completed because it was already closed.