Geode / GEODE-10122

With TLSv1.3 and GCM-based cipher (the default), P2P Messaging Fails When Encrypted Data Limit is Reached

Details

    Description

      TLSv1.3 introduced [1] the ability to set per-algorithm limits on symmetric key usage lifetimes. Once a certain number of bytes have been encrypted, a KeyUpdate post-handshake message [2] is sent.

      With default settings, on Liberica JDK 11, Geode's P2P framework will negotiate TLSv1.3 with the TLS_AES_256_GCM_SHA384 cipher suite. Geode P2P messaging will eventually fail, with a "Tag mismatch!" IOException in shared ordered receivers, after a session has been in heavy use for days.
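
To check whether a given runtime enables the protocol and cipher suite named above, one can ask JSSE what a freshly created SSLEngine has enabled. This is a minimal sketch, assuming a bare `SSLContext` with the JDK's default key and trust material; the class and method names (`TlsDefaultsProbe`, `newDefaultEngine`, `enables`) are hypothetical and not part of Geode, and Geode's own SSL configuration may enable a different set.

```java
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;

// Hypothetical helper, not part of Geode.
final class TlsDefaultsProbe {

  /** Creates an engine from a "TLS" context; null arguments let the provider pick its defaults. */
  static SSLEngine newDefaultEngine() {
    try {
      SSLContext context = SSLContext.getInstance("TLS");
      context.init(null, null, null);
      return context.createSSLEngine();
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException("could not create a default SSLEngine", e);
    }
  }

  /** Returns true only if both the protocol and the cipher suite are enabled on the engine. */
  static boolean enables(SSLEngine engine, String protocol, String cipherSuite) {
    return Arrays.asList(engine.getEnabledProtocols()).contains(protocol)
        && Arrays.asList(engine.getEnabledCipherSuites()).contains(cipherSuite);
  }

  public static void main(String[] args) {
    SSLEngine engine = newDefaultEngine();
    // The printed value depends on the JDK vendor and version in use.
    System.out.println("TLSv1.3 + TLS_AES_256_GCM_SHA384 enabled: "
        + enables(engine, "TLSv1.3", "TLS_AES_256_GCM_SHA384"));
  }
}
```

An enabled suite is only a candidate; which one is actually negotiated also depends on the peer.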

      We have not seen this failure on TLSv1.2.

      The implementation of TLSv1.3 in the Java runtime provides a security property [3] to configure the encrypted data limit. The attached patch to P2PMessagingConcurrencyDUnitTest configures the limit large enough that the test makes it through the (P2P) TLS handshake but small enough so that the "Tag mismatch!" exception is encountered less than a minute later.
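
As far as we know, the security property in question is `jdk.tls.keyLimits`, whose entries take the form `<algorithm> KeyUpdate <length>`. The sketch below shows one way a test could lower the limit. The exponent 17 is a hypothetical placeholder, not the value used in the attached patch, and the class and method names are assumptions for illustration. The property likely has to be set before the JDK's TLS classes first read it, so a test would need to apply it early.

```java
import java.security.Security;

// Hypothetical helper, not taken from the attached patch.
final class KeyLimitConfig {

  static final String KEY_LIMITS_PROPERTY = "jdk.tls.keyLimits";

  /** Builds an entry such as "AES/GCM/NoPadding KeyUpdate 2^17". */
  static String keyUpdateLimit(String algorithm, int exponent) {
    // Guard keeps 2^exponent representable as a positive long.
    if (exponent < 0 || exponent > 62) {
      throw new IllegalArgumentException("exponent out of range: " + exponent);
    }
    return algorithm + " KeyUpdate 2^" + exponent;
  }

  /** Sets the security property for the current JVM. */
  static void apply(String limit) {
    Security.setProperty(KEY_LIMITS_PROPERTY, limit);
  }

  public static void main(String[] args) {
    // Placeholder limit: small enough to be reached quickly under load.
    apply(keyUpdateLimit("AES/GCM/NoPadding", 17));
    System.out.println(KEY_LIMITS_PROPERTY + "=" + Security.getProperty(KEY_LIMITS_PROPERTY));
  }
}
```

Choosing the limit is a trade-off: too small and the initial handshake itself may cross it, too large and the test runs a long time before the first KeyUpdate.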

      The bug arises because Geode’s NioSslEngine class ignores the “rehandshaking” phase of the TLS protocol [4]:

          Creation - ready to be configured.

          Initial handshaking - perform authentication and negotiate communication parameters.

          Application data - ready for application exchange.

          Rehandshaking - renegotiate communications parameters/authentication; handshaking data may be mixed with application data.

          Closure - ready to shut down connection.

      Geode's tcp.Connection and NioSslEngine classes (particularly wrap() and unwrap()), as they are currently implemented, fail to fully attend to the handshake status from javax.net.ssl.SSLEngine. As a result these Geode classes fail to respond to the KeyUpdate message, resulting in the "Tag mismatch!" IOException.
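
What attending to the handshake status after unwrap() could look like is sketched below. This is an illustration under stated assumptions, not Geode's actual fix: the class `HandshakeAwareUnwrap`, the `Action` enum and the method names are hypothetical, only standard `javax.net.ssl` calls are used, and the caller is assumed to flush `netOut` to the peer afterwards. It requires Java 9 or later because of `NEED_UNWRAP_AGAIN`.

```java
import java.nio.ByteBuffer;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLEngineResult;
import javax.net.ssl.SSLEngineResult.HandshakeStatus;
import javax.net.ssl.SSLException;

// Hypothetical sketch, not Geode code.
final class HandshakeAwareUnwrap {

  enum Action { RUN_DELEGATED_TASKS, WRAP_AND_SEND, UNWRAP_MORE, NONE }

  private static final ByteBuffer EMPTY = ByteBuffer.allocate(0);

  /** Maps the engine's handshake status to what the caller has to do next. */
  static Action actionFor(HandshakeStatus status) {
    switch (status) {
      case NEED_TASK:
        return Action.RUN_DELEGATED_TASKS;
      case NEED_WRAP:
        return Action.WRAP_AND_SEND;
      case NEED_UNWRAP:
      case NEED_UNWRAP_AGAIN:
        return Action.UNWRAP_MORE;
      default: // FINISHED or NOT_HANDSHAKING
        return Action.NONE;
    }
  }

  /**
   * Unwraps once, then services post-handshake work the engine asks for.
   * Handshake bytes produced by wrap() accumulate in netOut; the caller must send them.
   * Afterwards the caller should consult engine.getHandshakeStatus() for remaining work.
   */
  static SSLEngineResult unwrap(SSLEngine engine, ByteBuffer netIn, ByteBuffer appOut,
      ByteBuffer netOut) throws SSLException {
    SSLEngineResult result = engine.unwrap(netIn, appOut);
    HandshakeStatus status = result.getHandshakeStatus();
    while (true) {
      switch (actionFor(status)) {
        case RUN_DELEGATED_TASKS: {
          Runnable task;
          while ((task = engine.getDelegatedTask()) != null) {
            task.run();
          }
          status = engine.getHandshakeStatus();
          break;
        }
        case WRAP_AND_SEND: {
          // No application data is supplied; the engine emits only handshake bytes.
          SSLEngineResult wrapped = engine.wrap(EMPTY, netOut);
          if (wrapped.getStatus() != SSLEngineResult.Status.OK) {
            return result; // e.g. BUFFER_OVERFLOW: caller flushes netOut and retries
          }
          status = wrapped.getHandshakeStatus();
          break;
        }
        default:
          return result; // more network input needed, or nothing left to do
      }
    }
  }

  public static void main(String[] args) {
    for (HandshakeStatus status : HandshakeStatus.values()) {
      System.out.println(status + " -> " + actionFor(status));
    }
  }
}
```

Keeping the status-to-action mapping in a separate method makes it testable without a live TLS session; the same mapping would apply after wrap() as well.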

      When that exception is encountered, the Connection is destroyed and a new one created in its place. But users of the old Connection, waiting for acknowledgements, will never receive them. This can result in cluster-wide hangs.

      [1] https://datatracker.ietf.org/doc/html/rfc8446#section-5.5

      [2] https://www.ibm.com/docs/en/sdk-java-technology/8?topic=handshake-post-messages

      [3] https://docs.oracle.com/en/java/javase/11/security/java-secure-socket-extension-jsse-reference-guide.html#GUID-B970ADD6-1E9F-4C18-A26E-0679B50CC946

      [4] https://www.ibm.com/docs/en/sdk-java-technology/7.1?topic=sslengine-

          People

            burcham Bill Burcham