Uploaded image for project: 'Qpid'
  1. Qpid
  2. QPID-8526

[Broker-J] Connection looping in NonBlockingConnectionTLSDelegate.doWrite()

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Minor
    • Resolution: Fixed
    • qpid-java-broker-8.0.4
    • qpid-java-broker-8.0.5
    • Broker-J
    • None

    Description

      Issue is similar to QPID-8489: on certain conditions SSL connection thread falls into an infinite loop consuming CPU.

      Stacktrace from the threaddump:

      "IO-/127.0.0.1:50414" #32 prio=5 os_prio=0 cpu=43504522.88ms elapsed=330519.20s tid=0x00007f47a0375000 nid=0x9089 runnable  [0x00007f47956fe000]
         java.lang.Thread.State: RUNNABLE
      	at sun.security.ssl.TransportContext.getHandshakeStatus(java.base@11.0.7/TransportContext.java:571)
      	at sun.security.ssl.SSLEngineImpl.getHandshakeStatus(java.base@11.0.7/SSLEngineImpl.java:801)
      	- locked <0x00007f4b721e7340> (a sun.security.ssl.SSLEngineImpl)
      	at org.apache.qpid.server.transport.NonBlockingConnectionTLSDelegate.wrapBufferArray(NonBlockingConnectionTLSDelegate.java:239)
      	at org.apache.qpid.server.transport.NonBlockingConnectionTLSDelegate.doWrite(NonBlockingConnectionTLSDelegate.java:164)
      	at org.apache.qpid.server.transport.NonBlockingConnection.doWrite(NonBlockingConnection.java:521)
      	at org.apache.qpid.server.transport.NonBlockingConnection.shutdownFinalWrite(NonBlockingConnection.java:422)
      	at org.apache.qpid.server.transport.NonBlockingConnection.shutdown(NonBlockingConnection.java:384)
      	at org.apache.qpid.server.transport.NonBlockingConnection.doWork(NonBlockingConnection.java:313)
      	at org.apache.qpid.server.transport.NetworkConnectionScheduler.processConnection(NetworkConnectionScheduler.java:134)
      	at org.apache.qpid.server.transport.SelectorThread$ConnectionProcessor.processConnection(SelectorThread.java:575)
      	at org.apache.qpid.server.transport.SelectorThread$SelectionTask.performSelect(SelectorThread.java:366)
      	at org.apache.qpid.server.transport.SelectorThread$SelectionTask.run(SelectorThread.java:97)
      	at org.apache.qpid.server.transport.SelectorThread.run(SelectorThread.java:533)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.7/ThreadPoolExecutor.java:1128)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.7/ThreadPoolExecutor.java:628)
      	at org.apache.qpid.server.bytebuffer.QpidByteBufferFactory.lambda$null$0(QpidByteBufferFactory.java:464)
      	at org.apache.qpid.server.bytebuffer.QpidByteBufferFactory$$Lambda$64/0x00007f47d52bcc40.run(Unknown Source)
      	at java.lang.Thread.run(java.base@11.0.7/Thread.java:834)
      

      Stacktrace 2 from the threaddump:

      "IO-/127.0.0.1:50414" #32 prio=5 os_prio=0 cpu=43536330.84ms elapsed=330555.26s tid=0x00007f47a0375000 nid=0x9089 runnable  [0x00007f47956fe000]
         java.lang.Thread.State: RUNNABLE
      	at org.apache.qpid.server.bytebuffer.QpidByteBufferFactory.encryptSSL(QpidByteBufferFactory.java:178)
      	at org.apache.qpid.server.bytebuffer.QpidByteBuffer.encryptSSL(QpidByteBuffer.java:62)
      	at org.apache.qpid.server.transport.NonBlockingConnectionTLSDelegate.wrapBufferArray(NonBlockingConnectionTLSDelegate.java:255)
      	at org.apache.qpid.server.transport.NonBlockingConnectionTLSDelegate.doWrite(NonBlockingConnectionTLSDelegate.java:164)
      	at org.apache.qpid.server.transport.NonBlockingConnection.doWrite(NonBlockingConnection.java:521)
      	at org.apache.qpid.server.transport.NonBlockingConnection.shutdownFinalWrite(NonBlockingConnection.java:422)
      	at org.apache.qpid.server.transport.NonBlockingConnection.shutdown(NonBlockingConnection.java:384)
      	at org.apache.qpid.server.transport.NonBlockingConnection.doWork(NonBlockingConnection.java:313)
      	at org.apache.qpid.server.transport.NetworkConnectionScheduler.processConnection(NetworkConnectionScheduler.java:134)
      	at org.apache.qpid.server.transport.SelectorThread$ConnectionProcessor.processConnection(SelectorThread.java:575)
      	at org.apache.qpid.server.transport.SelectorThread$SelectionTask.performSelect(SelectorThread.java:366)
      	at org.apache.qpid.server.transport.SelectorThread$SelectionTask.run(SelectorThread.java:97)
      	at org.apache.qpid.server.transport.SelectorThread.run(SelectorThread.java:533)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.7/ThreadPoolExecutor.java:1128)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.7/ThreadPoolExecutor.java:628)
      	at org.apache.qpid.server.bytebuffer.QpidByteBufferFactory.lambda$null$0(QpidByteBufferFactory.java:464)
      	at org.apache.qpid.server.bytebuffer.QpidByteBufferFactory$$Lambda$64/0x00007f47d52bcc40.run(Unknown Source)
      	at java.lang.Thread.run(java.base@11.0.7/Thread.java:834)
      

      Stucked connection details (retrieved via broker REST API):

      {
        "id" : "60b493c2-de91-4690-8e4c-1e3d5d5ea353",
        "name" : "[8332442] 127.0.0.1:64711",
        "type" : "AMQP_1_0",
        "desiredState" : "ACTIVE",
        "state" : "ACTIVE",
        "durable" : false,
        "lifetimePolicy" : "PERMANENT",
        "incoming" : true,
        "incomingIdleTimeout" : 60000,
        "lastOpenedTime" : 1618942801846,
        "maxUncommittedInMemorySize" : 10485760,
        "outgoingIdleTimeout" : 0,
        "port" : "amqps",
        "protocol" : "AMQP_1_0",
        "remoteAddress" : "/127.0.0.1:64711",
        "sendSaslFinalChallengeAsChallenge" : false,
        "sessionCountLimit" : 1,
        "transport" : "SSL",
        "transportInfo" : "TLSv1.2 ; TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256",
        "lastUpdatedTime" : 1618942801846,
        "createdTime" : 1618942801846,
        "statistics" : {
          "bytesIn" : 0,
          "bytesOut" : 0,
          "lastInboundMessageTime" : 1618942801846,
          "lastIoTime" : 1618942801846,
          "lastMessageTime" : 1618942801846,
          "lastOutboundMessageTime" : 1618942801846,
          "localTransactionBegins" : 0,
          "localTransactionOpen" : 0,
          "localTransactionRollbacks" : 0,
          "messagesIn" : 0,
          "messagesOut" : 0,
          "sessionCount" : 0,
          "transactedMessagesIn" : 0,
          "transactedMessagesOut" : 0
        }
      }
      

      It seems that broker tries to close SSL connection and loops inside NonBlockingConnection.shutdownFinalWrite() continuously calling NonBlockingConnection.doWrite() -> NonBlockingConnectionTLSDelegate.doWrite() -> NonBlockingConnectionTLSDelegate.wrapBufferArray(). 

      After adding circuit breaker and explicit logging in methods NonBlockingConnection.doWrite(), NonBlockingConnectionTLSDelegate.doWrite() and NonBlockingConnectionTLSDelegate.wrapBufferArray() following logs were retrieved:

      2021-05-10 06:30:54,637 INFO  [Broker-Config] (q.m.c.open) - [con:9,435,937(/127.0.0.1:52648)] CON-1001 : Open : Destination : amqps(90.163.253.5:10170) : Protocol Version : 1.0 : SSL
      2021-05-10 06:30:54,688 INFO  [IO-/127.0.0.1:52648] (q.m.c.open) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)] CON-1001 : Open : Destination : amqps(90.163.253.5:10170) : Protocol Version : 1.0 : SSL : Client ID : Exxeta-299e32f1-1d0a-4db6-aeb8-1822411b7fc6 : Client Version : 0.50.0 : Client Product : QpidJMS
      2021-05-10 06:30:54,689 INFO  [IO-/127.0.0.1:52648] (q.m.a.allowed) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)] ACL-1001 : Allowed : Access Virtualhost NAME=default,VIRTUALHOST_NAME=default
      2021-05-10 06:30:54,709 INFO  [IO-/127.0.0.1:52648] (q.m.c.create) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:0] CHN-1001 : Create
      2021-05-10 06:30:54,739 INFO  [IO-/127.0.0.1:52648] (q.m.c.create) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:1] CHN-1001 : Create
      2021-05-10 06:30:54,771 INFO  [VirtualHostNode-default-Config] (q.m.a.allowed) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:1] ACL-1001 : Allowed : Consume Queue NAME=broadcast.VTVTV_YYYYXXXXXZZZSSS.TradeConfirmation,DURABLE=true,EXCLUSIVE=false,TEMPORARY=false,AUTO_DELETE=false,VIRTUALHOST_NAME=default,CREATED_BY=admin
      2021-05-10 06:30:54,771 INFO  [VirtualHostNode-default-Config] (q.m.s.create) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:1] [sub:36,978(vh(/default)/qu(broadcast.VTVTV_YYYYXXXXXZZZSSS.TradeConfirmation)] SUB-1001 : Create
      2021-05-10 06:30:54,799 INFO  [VirtualHostNode-default-Config] (q.m.a.allowed) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:1] ACL-1001 : Allowed : Consume Queue NAME=broadcast.VTVTV_YYYYXXXXXZZZSSS.Workflow,DURABLE=true,EXCLUSIVE=false,TEMPORARY=false,AUTO_DELETE=false,VIRTUALHOST_NAME=default,CREATED_BY=admin
      2021-05-10 06:30:54,799 INFO  [VirtualHostNode-default-Config] (q.m.s.create) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:1] [sub:36,979(vh(/default)/qu(broadcast.VTVTV_YYYYXXXXXZZZSSS.Workflow)] SUB-1001 : Create
      2021-05-10 06:30:54,816 INFO  [VirtualHostNode-default-Config] (q.m.a.allowed) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:1] ACL-1001 : Allowed : Consume Queue NAME=broadcast.VTVTV_YYYYXXXXXZZZSSS.Public,DURABLE=true,EXCLUSIVE=false,TEMPORARY=false,AUTO_DELETE=false,VIRTUALHOST_NAME=default,CREATED_BY=admin
      2021-05-10 06:30:54,816 INFO  [VirtualHostNode-default-Config] (q.m.s.create) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:1] [sub:36,980(vh(/default)/qu(broadcast.VTVTV_YYYYXXXXXZZZSSS.Public)] SUB-1001 : Create
      2021-05-10 06:30:54,834 INFO  [VirtualHostNode-default-Config] (q.m.a.allowed) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:1] ACL-1001 : Allowed : Consume Queue NAME=response.VTVTV_YYYYXXXXXZZZSSS,DURABLE=true,EXCLUSIVE=false,TEMPORARY=false,AUTO_DELETE=false,VIRTUALHOST_NAME=default,CREATED_BY=admin
      2021-05-10 06:30:54,834 INFO  [VirtualHostNode-default-Config] (q.m.s.create) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:1] [sub:36,981(vh(/default)/qu(response.VTVTV_YYYYXXXXXZZZSSS)] SUB-1001 : Create
      2021-05-10 06:30:54,856 INFO  [IO-/127.0.0.1:52648] (q.m.s.create) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:1] SND-1001 : Create : qpid-jms:sender:ID:e72a83ab-2332-4e64-b891-0bf26cb8f632:1:1:1:request.VTVTV_YYYYXXXXXZZZSSS : request.VTVTV_YYYYXXXXXZZZSSS
      2021-05-10 10:01:40,576 INFO  [IO-/127.0.0.1:52648] (q.m.c.close) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:0] [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:0] CHN-1003 : Close
      2021-05-10 10:01:40,576 INFO  [VirtualHostNode-default-Config] (q.m.s.close) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:1] [sub:36,978(vh(/default)/qu(broadcast.VTVTV_YYYYXXXXXZZZSSS.TradeConfirmation)] SUB-1002 : Close
      2021-05-10 10:01:40,576 INFO  [IO-/127.0.0.1:52648] (q.m.s.close) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:1] SND-1002 : Close : qpid-jms:sender:ID:e72a83ab-2332-4e64-b891-0bf26cb8f632:1:1:1:request.VTVTV_YYYYXXXXXZZZSSS : request.VTVTV_YYYYXXXXXZZZSSS
      2021-05-10 10:01:40,577 INFO  [VirtualHostNode-default-Config] (q.m.s.close) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:1] [sub:36,980(vh(/default)/qu(broadcast.VTVTV_YYYYXXXXXZZZSSS.Public)] SUB-1002 : Close
      2021-05-10 10:01:40,578 INFO  [VirtualHostNode-default-Config] (q.m.s.close) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:1] [sub:36,981(vh(/default)/qu(response.VTVTV_YYYYXXXXXZZZSSS)] SUB-1002 : Close
      2021-05-10 10:01:40,578 INFO  [VirtualHostNode-default-Config] (q.m.s.close) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:1] [sub:36,979(vh(/default)/qu(broadcast.VTVTV_YYYYXXXXXZZZSSS.Workflow)] SUB-1002 : Close
      2021-05-10 10:01:40,579 INFO  [IO-/127.0.0.1:52648] (q.m.c.close) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:1] [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)/ch:1] CHN-1003 : Close
      2021-05-10 10:01:40,579 INFO  [IO-/127.0.0.1:52648] (o.a.q.s.p.v.AMQPConnection_1_0Impl) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@/127.0.0.1:52648/default)]  : Connection closed with error : amqp:resource-limit-exceeded - local-idle-timeout expired
      2021-05-10 10:01:40,644 WARN  [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnectionTLSDelegate) - bufCount: 2
      2021-05-10 10:01:40,644 WARN  [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnectionTLSDelegate) - _status: [ Status = CLOSED, HandshakeStatus = NOT_HANDSHAKING, bytesConsumed = 0, bytesProduced = 0 ], _sslEngine.isOutboundDone(): true, _sslEngine.isInboundDone(): true, _sslEngine.getPeerHost(): null, _sslEngine.getPeerPort(): -1, buffers.size(): 2, _encryptedOutput.size(): 0, _netOutputBuffer.remaining(): 152185, _sslEngine.getSession().getPacketBufferSize(): 16709, _sslEngine.getSession().getCreationTime(): 1620621054549, _sslEngine.getSession().getLastAccessedTime(): 1620621054549, _sslEngine.getHandshakeSession().isValid(): null handshake session
      2021-05-10 10:01:40,645 WARN  [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnectionTLSDelegate) - buffers.size(): 2, bufsSent: false, bytesWritten: -1, _encryptedOutput: 0
      2021-05-10 10:01:40,645 WARN  [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnection) - Looping detected: false, 0, 2
      2021-05-10 10:01:40,645 WARN  [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnectionTLSDelegate) - bufCount: 2
      2021-05-10 10:01:40,645 WARN  [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnectionTLSDelegate) - _status: [ Status = CLOSED, HandshakeStatus = NOT_HANDSHAKING, bytesConsumed = 0, bytesProduced = 0 ], _sslEngine.isOutboundDone(): true, _sslEngine.isInboundDone(): true, _sslEngine.getPeerHost(): null, _sslEngine.getPeerPort(): -1, buffers.size(): 2, _encryptedOutput.size(): 0, _netOutputBuffer.remaining(): 152185, _sslEngine.getSession().getPacketBufferSize(): 16709, _sslEngine.getSession().getCreationTime(): 1620621054549, _sslEngine.getSession().getLastAccessedTime(): 1620621054549, _sslEngine.getHandshakeSession().isValid(): null handshake session
      2021-05-10 10:01:40,645 WARN  [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnectionTLSDelegate) - buffers.size(): 2, bufsSent: false, bytesWritten: -1, _encryptedOutput: 0
      2021-05-10 10:01:40,645 WARN  [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnection) - Looping detected: false, 0, 2
      2021-05-10 10:01:40,645 WARN  [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnectionTLSDelegate) - bufCount: 2
      2021-05-10 10:01:40,645 WARN  [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnectionTLSDelegate) - _status: [ Status = CLOSED, HandshakeStatus = NOT_HANDSHAKING, bytesConsumed = 0, bytesProduced = 0 ], _sslEngine.isOutboundDone(): true, _sslEngine.isInboundDone(): true, _sslEngine.getPeerHost(): null, _sslEngine.getPeerPort(): -1, buffers.size(): 2, _encryptedOutput.size(): 0, _netOutputBuffer.remaining(): 152185, _sslEngine.getSession().getPacketBufferSize(): 16709, _sslEngine.getSession().getCreationTime(): 1620621054549, _sslEngine.getSession().getLastAccessedTime(): 1620621054549, _sslEngine.getHandshakeSession().isValid(): null handshake session
      2021-05-10 10:01:40,645 WARN  [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnectionTLSDelegate) - buffers.size(): 2, bufsSent: false, bytesWritten: -1, _encryptedOutput: 0
      2021-05-10 10:01:40,645 WARN  [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnection) - Looping detected: false, 0, 2
      2021-05-10 10:01:40,645 WARN  [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnectionTLSDelegate) - bufCount: 2
      2021-05-10 10:01:40,645 WARN  [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnectionTLSDelegate) - _status: [ Status = CLOSED, HandshakeStatus = NOT_HANDSHAKING, bytesConsumed = 0, bytesProduced = 0 ], _sslEngine.isOutboundDone(): true, _sslEngine.isInboundDone(): true, _sslEngine.getPeerHost(): null, _sslEngine.getPeerPort(): -1, buffers.size(): 2, _encryptedOutput.size(): 0, _netOutputBuffer.remaining(): 152185, _sslEngine.getSession().getPacketBufferSize(): 16709, _sslEngine.getSession().getCreationTime(): 1620621054549, _sslEngine.getSession().getLastAccessedTime(): 1620621054549, _sslEngine.getHandshakeSession().isValid(): null handshake session
      2021-05-10 10:01:40,645 WARN  [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnectionTLSDelegate) - buffers.size(): 2, bufsSent: false, bytesWritten: -1, _encryptedOutput: 0
      2021-05-10 10:01:40,645 WARN  [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnection) - Looping detected: false, 0, 2
      2021-05-10 10:01:40,645 WARN  [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnectionTLSDelegate) - bufCount: 2
      2021-05-10 10:01:40,646 WARN  [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnectionTLSDelegate) - _status: [ Status = CLOSED, HandshakeStatus = NOT_HANDSHAKING, bytesConsumed = 0, bytesProduced = 0 ], _sslEngine.isOutboundDone(): true, _sslEngine.isInboundDone(): true, _sslEngine.getPeerHost(): null, _sslEngine.getPeerPort(): -1, buffers.size(): 2, _encryptedOutput.size(): 0, _netOutputBuffer.remaining(): 152185, _sslEngine.getSession().getPacketBufferSize(): 16709, _sslEngine.getSession().getCreationTime(): 1620621054549, _sslEngine.getSession().getLastAccessedTime(): 1620621054549, _sslEngine.getHandshakeSession().isValid(): null handshake session
      2021-05-10 10:01:40,646 INFO  [IO-/127.0.0.1:52648] (o.a.q.s.t.NonBlockingConnection) - Exception performing final write/close for '/127.0.0.1:52648': Looping detected, executing circuit breaker
      2021-05-10 10:01:40,647 INFO  [Broker-Config] (q.m.c.close) - [con:9,435,937(VTVTV_YYYYXXXXXZZZSSS@null/default)] CON-1002 : Close
      

      Noticeable is SSLEngineResult.Status.CLOSED as well as SSLEngineResult.HandshakeStatus NOT_HANDSHAKING. Combination of those statuses indicates that the last operation closed broker side of the SSLEngine, or the operation could not be completed because it was already closed.

      Attachments

        Activity

          People

            Unassigned Unassigned
            daniel.kirilyuk Daniil Kirilyuk
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: