Description
With the NIOSSL transport, SSL handshakes for ~5000 connections easily stall a broker at 100% CPU.
I'm using ActiveMQ 5.8, but the problem also occurs on 5.9 and 5.10.
Profiling showed that the SSL handshake on the broker side consumes ~90% of overall CPU time,
almost all of it spent merely checking the handshake status at very high frequency.
Top 3 methods sorted by self CPU time:
com.sun.net.ssl.internal.ssl.SSLEngineImpl.getHandshakeStatus()
org.apache.activemq.transport.nio.NIOSSLTransport.doHandshake()
com.sun.net.ssl.internal.ssl.SSLEngineImpl.getHSStatus(javax.net.ssl.SSLEngineResult$HandshakeStatus)
The cause is the asynchronous nature of the SSL handshake with NIO, in particular the way delegated tasks are executed:
- NIOSSLTransport.doHandshake() hands delegated tasks to a TaskRunnerFactory, which runs them asynchronously
- while those tasks are running, doHandshake() keeps looping, calling SSLEngine.getHandshakeStatus() on every pass
To improve the situation I made the following changes:
- run delegated tasks synchronously in doHandshake() (handshake status NEED_TASK) instead of asynchronously
- added short wait cycles in secureRead(), since with NIO data is not always available on a read (this further reduces the number of calls to SSLEngine.getHandshakeStatus())
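
The two changes can be illustrated with a minimal sketch. This is not the actual patch to NIOSSLTransport: the class HandshakeHelpers, the methods runDelegatedTasks and readWithPause, and the parameters maxAttempts and pauseNanos are hypothetical names chosen for illustration. Only SSLEngine.getDelegatedTask(), SSLEngine.getHandshakeStatus(), ReadableByteChannel.read() and LockSupport.parkNanos() are real JDK calls.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;
import java.util.concurrent.locks.LockSupport;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLEngineResult.HandshakeStatus;

/** Hypothetical helpers for illustration; not part of ActiveMQ. */
final class HandshakeHelpers {

    private HandshakeHelpers() {}

    /**
     * Change 1 (sketch): run every pending delegated task on the calling
     * thread, then query the handshake status once. Because the tasks have
     * finished by the time this returns, the caller has no reason to poll
     * getHandshakeStatus() in a loop while waiting for them.
     */
    static HandshakeStatus runDelegatedTasks(SSLEngine engine) {
        Runnable task;
        while ((task = engine.getDelegatedTask()) != null) {
            task.run(); // synchronous: blocks this thread until the task is done
        }
        return engine.getHandshakeStatus();
    }

    /**
     * Change 2 (sketch): read from a non-blocking channel and pause briefly
     * after each empty read instead of retrying immediately.
     * Returns the byte count of the first non-empty read, -1 at end of
     * stream, or 0 if nothing arrived within maxAttempts reads.
     */
    static int readWithPause(ReadableByteChannel channel, ByteBuffer dst,
                             int maxAttempts, long pauseNanos) throws IOException {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            int n = channel.read(dst);
            if (n != 0) {
                return n; // data was read (> 0) or the stream ended (-1)
            }
            LockSupport.parkNanos(pauseNanos); // give up the CPU instead of spinning
        }
        return 0;
    }
}
```

Running the tasks inline trades some latency on the calling thread for the removal of the polling loop. The pause length and attempt limit are assumptions and would need tuning for a real broker.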
After these changes, the SSL handshake for several thousand connections in parallel was no longer a problem.