GEODE-8020: buffer corruption in SSL communications

Details

    Description

      Update (May 8, 2020): The main problem described here appears to occur only on JDK 8 when TLSv1 is used. JDK 11 with TLSv1 does not exhibit the problem, nor is it apparent when TLSv1.2 is used on either JDK. This issue is marked resolved, but the problem still occurs on JDK 8 with TLSv1, so customers should use TLSv1.2 or later. Other buffering problems were found during this investigation, and a PR was merged to address them.
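
As an illustration of the TLSv1.2 recommendation, here is a minimal sketch of restricting a plain JSSE `SSLEngine` to TLSv1.2. The class name `Tls12Only` and the helper `newTls12Engine` are hypothetical, and the default key and trust material is an assumption made to keep the example short. This is not Geode's own code, and how Geode applies its protocol settings is not covered in this report.

```java
import java.util.Arrays;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;

final class Tls12Only {
    // Hypothetical helper: builds an engine that only negotiates TLSv1.2.
    static SSLEngine newTls12Engine() {
        try {
            SSLContext context = SSLContext.getInstance("TLSv1.2");
            // Assumption: default key managers, trust managers and randomness.
            context.init(null, null, null);
            SSLEngine engine = context.createSSLEngine();
            // Drop every other protocol version, including TLSv1.
            engine.setEnabledProtocols(new String[] {"TLSv1.2"});
            return engine;
        } catch (Exception e) {
            throw new IllegalStateException("could not create TLSv1.2 engine", e);
        }
    }

    public static void main(String[] args) {
        SSLEngine engine = newTls12Engine();
        System.out.println(Arrays.toString(engine.getEnabledProtocols()));
    }
}
```

Checking `getEnabledProtocols()` on the resulting engine is a quick way to confirm that TLSv1 is no longer offered.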

      When running an application with SSL enabled, I ran into a hang in which a message was lost. The sender logged a 15-second ack-wait warning pointing to another server in the cluster. That server had this in its log file at the time the message would have been processed:

      [info 2020/04/21 11:22:39.437 PDT <P2P message reader for rs-bschuchardt-1053-hydra-client-1(bridgegemfire4_host1_12599:12599)<ec><v1>:41003 unshared ordered uid=354 dom #2 port=55262> tid=0xad] P2P message reader@2580db5f io exception for rs-bschuchardt-1053-hydra-client-1(bridgegemfire4_host1_12599:12599)<ec><v1>:41003@354(GEODE 1.10.0)
      javax.net.ssl.SSLException: bad record MAC
      	at sun.security.ssl.Alerts.getSSLException(Alerts.java:214)
      	at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1728)
      	at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:986)
      	at sun.security.ssl.SSLEngineImpl.readNetRecord(SSLEngineImpl.java:912)
      	at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:782)
      	at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:626)
      	at org.apache.geode.internal.net.NioSslEngine.unwrap(NioSslEngine.java:275)
      	at org.apache.geode.internal.tcp.Connection.processInputBuffer(Connection.java:2894)
      	at org.apache.geode.internal.tcp.Connection.readMessages(Connection.java:1745)
      	at org.apache.geode.internal.tcp.Connection.run(Connection.java:1577)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at java.lang.Thread.run(Thread.java:748)
      Caused by: javax.crypto.BadPaddingException: bad record MAC
      	at sun.security.ssl.InputRecord.decrypt(InputRecord.java:219)
      	at sun.security.ssl.EngineInputRecord.decrypt(EngineInputRecord.java:177)
      	at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:979)
      	... 10 more
      

      I bisected to see when this problem was introduced and found it was this commit:

      commit 418d929e3e03185cd6330c828c9b9ed395a76d4b
      Author: Mario Ivanac <48509724+mivanac@users.noreply.github.com>
      Date:   Fri Nov 1 20:28:57 2019 +0100
      
          GEODE-6661: Fixed use of Direct and Non-Direct buffers (#4267)
      
          - Fixed use of Direct and Non-Direct buffers
      

      That commit modified NioSslEngine to use a "direct" byte buffer instead of a heap byte buffer. If I revert just that part of the PR, the test works.
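
For readers unfamiliar with the distinction, the sketch below shows the two kinds of `java.nio.ByteBuffer` involved. The class `BufferKinds` and its `allocate` helper are hypothetical names used only for illustration; this is not the code from the commit. A boolean switch like this is one way to flip between the two allocation styles while bisecting, assuming the rest of the code only uses the `ByteBuffer` API.

```java
import java.nio.ByteBuffer;

final class BufferKinds {
    // Hypothetical helper: choose between off-heap (direct) and heap storage.
    static ByteBuffer allocate(int size, boolean direct) {
        return direct ? ByteBuffer.allocateDirect(size) : ByteBuffer.allocate(size);
    }

    public static void main(String[] args) {
        ByteBuffer heap = allocate(16, false);
        ByteBuffer direct = allocate(16, true);

        // A heap buffer is backed by an accessible byte[]; a direct buffer is not.
        System.out.println("heap:   isDirect=" + heap.isDirect() + " hasArray=" + heap.hasArray());
        System.out.println("direct: isDirect=" + direct.isDirect() + " hasArray=" + direct.hasArray());
    }
}
```

Both kinds expose the same position, limit, flip and compact operations, so the allocation style can be swapped without changing how the buffer is passed to `SSLEngine.unwrap`.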

          People

            bschuchardt Bruce J Schuchardt