Uploaded image for project: 'Kafka'
  1. Kafka
  2. KAFKA-16305

Optimisation in SslTransportLayer:handshakeUnwrap stalls TLS handshake

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 3.7.0, 3.6.1
    • 3.8.0, 3.7.1
    • None
    • None

    Description

      Kafka allows users to configure custom SSLEngine via the SslEngineFactory interface. There have been attempts to use an OpenSSL based SSLEngine using netty-handler over the JDK based implementation for performance reasons.

      While trying to use a Netty/Openssl based SSLEngine, we observe that the server hangs while performing the TLS handshake. We observe the following logs

      2024-02-26 01:40:00,117 data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] SslTransportLayer    - [SslTransportLayer channelId=127.0.0.1:60045-127.0.0.1:60046-0 key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId 127.0.0.1:60045-127.0.0.1:60046-0 state NOT_INITIALIZED
      2024-02-26 01:40:00,117 data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [INFO ] Selector             - [SocketServer listenerType=ZK_BROKER, nodeId=101] XX-Gaurav: calling prepare channelId 127.0.0.1:60045-127.0.0.1:60046-0
      2024-02-26 01:40:00,117 data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] SslTransportLayer    - [SslTransportLayer channelId=127.0.0.1:60045-127.0.0.1:60046-0 key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId 127.0.0.1:60045-127.0.0.1:60046-0 state NOT_INITIALIZED
      2024-02-26 01:40:00,117 data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] SslTransportLayer    - [SslTransportLayer channelId=127.0.0.1:60045-127.0.0.1:60046-0 key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId 127.0.0.1:60045-127.0.0.1:60046-0 state HANDSHAKE
      2024-02-26 01:40:00,117 data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] SslTransportLayer    - [SslTransportLayer channelId=127.0.0.1:60045-127.0.0.1:60046-0 key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, interestOps=1, readyOps=0] SSLHandshake NEED_UNWRAP channelId 127.0.0.1:60045-127.0.0.1:60046-0, appReadBuffer pos 0, netReadBuffer pos 0, netWriteBuffer pos 0
      2024-02-26 01:40:00,117 data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] SslTransportLayer    - [SslTransportLayer channelId=127.0.0.1:60045-127.0.0.1:60046-0 key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, interestOps=1, readyOps=0] SSLHandshake handshakeUnwrap channelId 127.0.0.1:60045-127.0.0.1:60046-0 doRead true
      2024-02-26 01:40:00,118 data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] SslTransportLayer    - [SslTransportLayer channelId=127.0.0.1:60045-127.0.0.1:60046-0 key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, interestOps=1, readyOps=0] SSLHandshake post handshakeUnwrap: channelId 127.0.0.1:60045-127.0.0.1:60046-0 handshakeStatus NEED_UNWRAP status BUFFER_UNDERFLOW read 0
      2024-02-26 01:40:00,118 data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] SslTransportLayer    - [SslTransportLayer channelId=127.0.0.1:60045-127.0.0.1:60046-0 key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, interestOps=1, readyOps=0] SSLHandshake post unwrap NEED_UNWRAP channelId 127.0.0.1:60045-127.0.0.1:60046-0, handshakeResult Status = BUFFER_UNDERFLOW HandshakeStatus = NEED_UNWRAP bytesConsumed = 0 bytesProduced = 0, appReadBuffer pos 0, netReadBuffer pos 0, netWriteBuffer pos 0 handshakeStatus NEED_UNWRAP
      2024-02-26 01:40:00,118 data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] SslTransportLayer    - [SslTransportLayer channelId=127.0.0.1:60045-127.0.0.1:60046-0 key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId 127.0.0.1:60045-127.0.0.1:60046-0 state HANDSHAKE
      2024-02-26 01:40:00,118 data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] SslTransportLayer    - [SslTransportLayer channelId=127.0.0.1:60045-127.0.0.1:60046-0 key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId 127.0.0.1:60045-127.0.0.1:60046-0 state HANDSHAKE
      2024-02-26 01:40:00,118 data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] SslTransportLayer    - [SslTransportLayer channelId=127.0.0.1:60045-127.0.0.1:60046-0 key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId 127.0.0.1:60045-127.0.0.1:60046-0 state HANDSHAKE
      2024-02-26 01:40:00,118 data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] SslTransportLayer    - [SslTransportLayer channelId=127.0.0.1:60045-127.0.0.1:60046-0 key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId 127.0.0.1:60045-127.0.0.1:60046-0 state HANDSHAKE
      2024-02-26 01:40:00,118 data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] SslTransportLayer    - [SslTransportLayer channelId=127.0.0.1:60045-127.0.0.1:60046-0 key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId 127.0.0.1:60045-127.0.0.1:60046-0 state HANDSHAKE
      2024-02-26 01:40:00,119 data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] SslTransportLayer    - [SslTransportLayer channelId=127.0.0.1:60045-127.0.0.1:60046-0 key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId 127.0.0.1:60045-127.0.0.1:60046-0 state HANDSHAKE
      2024-02-26 01:40:00,119 data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] SslTransportLayer    - [SslTransportLayer channelId=127.0.0.1:60045-127.0.0.1:60046-0 key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId 127.0.0.1:60045-127.0.0.1:60046-0 state HANDSHAKE
      2024-02-26 01:40:00,120 data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [INFO ] Selector             - [SocketServer listenerType=ZK_BROKER, nodeId=101] XX-Gaurav: calling prepare channelId 127.0.0.1:60045-127.0.0.1:60046-0
      2024-02-26 01:40:00,120 data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] SslTransportLayer    - [SslTransportLayer channelId=127.0.0.1:60045-127.0.0.1:60046-0 key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId 127.0.0.1:60045-127.0.0.1:60046-0 state HANDSHAKE
      2024-02-26 01:40:00,120 data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] SslTransportLayer    - [SslTransportLayer channelId=127.0.0.1:60045-127.0.0.1:60046-0 key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, interestOps=1, readyOps=0] XX-Gaurav ready isReady false channelId 127.0.0.1:60045-127.0.0.1:60046-0 state HANDSHAKE
      2024-02-26 01:40:00,120 data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] SslTransportLayer    - [SslTransportLayer channelId=127.0.0.1:60045-127.0.0.1:60046-0 key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, interestOps=1, readyOps=0] SSLHandshake NEED_UNWRAP channelId 127.0.0.1:60045-127.0.0.1:60046-0, appReadBuffer pos 0, netReadBuffer pos 383, netWriteBuffer pos 0
      2024-02-26 01:40:00,120 data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] SslTransportLayer    - [SslTransportLayer channelId=127.0.0.1:60045-127.0.0.1:60046-0 key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, interestOps=1, readyOps=0] SSLHandshake handshakeUnwrap channelId 127.0.0.1:60045-127.0.0.1:60046-0 doRead true
      2024-02-26 01:40:00,122 data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] SslTransportLayer    - [SslTransportLayer channelId=127.0.0.1:60045-127.0.0.1:60046-0 key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, interestOps=1, readyOps=0] SSLHandshake post handshakeUnwrap: channelId 127.0.0.1:60045-127.0.0.1:60046-0 handshakeStatus NEED_UNWRAP status OK read 0
      2024-02-26 01:40:00,122 data-plane-kafka-network-thread-101-ListenerName(TEST)-SASL_SSL-0 [TRACE] SslTransportLayer    - [SslTransportLayer channelId=127.0.0.1:60045-127.0.0.1:60046-0 key=channel=java.nio.channels.SocketChannel[connected local=/127.0.0.1:60045 remote=/127.0.0.1:60046], selector=sun.nio.ch.KQueueSelectorImpl@18d201be, interestOps=1, readyOps=0] SSLHandshake post unwrap NEED_UNWRAP channelId 127.0.0.1:60045-127.0.0.1:60046-0, handshakeResult Status = OK HandshakeStatus = NEED_TASK bytesConsumed = 383 bytesProduced = 0, appReadBuffer pos 0, netReadBuffer pos 0, netWriteBuffer pos 0 handshakeStatus NEED_UNWRAP
      

      Lines 21 and 22 are of interest here as we observe that:
      1. handshakeStatus is NEED_UNWRAP and
      2. netReadBuffer.position() == 0

      The do-while loop in handshakeUnwrap() is exited because of (2) and as a result, the SSLEngine's state machine isn't handled properly.

      We should therefore, avoid the optimisation and simplify the condition to be just while(cont).

      Attachments

        Issue Links

          Activity

            People

              gnarula Gaurav Narula
              gnarula Gaurav Narula
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: