Kafka / KAFKA-2561

Optionally support OpenSSL for SSL/TLS

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.9.0.0
    • Fix Version/s: None
    • Component/s: security
    • Labels:
      None

      Description

      JDK's `SSLEngine` is unfortunately a bit slow (KAFKA-2431 covers this in more detail). We should consider supporting OpenSSL for SSL/TLS. Initial experiments on my laptop show that it performs a lot better:

      start.time, end.time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec, config
      2015-09-21 14:41:58:245, 2015-09-21 14:47:02:583, 28610.2295, 94.0081, 30000000, 98574.6111, Java 8u60/server auth JDK SSLEngine/TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA
      2015-09-21 14:38:24:526, 2015-09-21 14:40:19:941, 28610.2295, 247.8900, 30000000, 259931.5514, Java 8u60/server auth OpenSslEngine/TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
      2015-09-21 14:49:03:062, 2015-09-21 14:50:27:764, 28610.2295, 337.7751, 30000000, 354182.9000, Java 8u60/plaintext
      

      Extracting the throughput figures:

      • JDK SSLEngine: 94 MB/s
      • OpenSSL SSLEngine: 247 MB/s
      • Plaintext: 337 MB/s (code from trunk, so no zero-copy due to KAFKA-2517)
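The MB.sec column can be re-derived from the timestamps and the data volume in each row. The following is a minimal sketch of that arithmetic; the class and method names are hypothetical and are not part of Kafka's perf tooling.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: recomputes throughput from the result rows above.
final class ThroughputCheck {
    // Timestamp layout used in the rows, e.g. 2015-09-21 14:41:58:245
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss:SSS");

    /** Megabytes consumed divided by elapsed wall-clock seconds. */
    static double mbPerSec(String start, String end, double megabytes) {
        Duration elapsed = Duration.between(
                LocalDateTime.parse(start, FMT), LocalDateTime.parse(end, FMT));
        return megabytes / (elapsed.toMillis() / 1000.0);
    }

    public static void main(String[] args) {
        double mb = 28610.2295; // data.consumed.in.MB, identical in all three runs
        double jdk = mbPerSec("2015-09-21 14:41:58:245", "2015-09-21 14:47:02:583", mb);
        double openSsl = mbPerSec("2015-09-21 14:38:24:526", "2015-09-21 14:40:19:941", mb);
        double plain = mbPerSec("2015-09-21 14:49:03:062", "2015-09-21 14:50:27:764", mb);
        System.out.printf("JDK %.2f, OpenSSL %.2f, plaintext %.2f MB/s%n", jdk, openSsl, plain);
        System.out.printf("OpenSSL/JDK %.2fx, OpenSSL/plaintext %.0f%%, JDK/plaintext %.0f%%%n",
                openSsl / jdk, 100 * openSsl / plain, 100 * jdk / plain);
    }
}
```

On these rows the OpenSSL run comes out at roughly 2.6 times the JDK run and roughly 73% of the plaintext run, while the JDK run reaches roughly 28% of plaintext.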

      In order to get these figures, I used Netty's `OpenSslEngine` by hacking `SSLFactory` to use Netty's `SslContextBuilder` and made a few changes to `SSLTransportLayer` to work around differences in behaviour between `OpenSslEngine` and the JDK's `SSLEngine` (filed https://github.com/netty/netty/issues/4235 and https://github.com/netty/netty/issues/4238 upstream).

        Activity

        jaikiran jaikiran pai added a comment -

        > Later this week, I plan to rerun the same thing with Java 9 and see how it performs.

        The same blog[1] has now been updated to include the performance numbers when Java 9 was used. To summarize, there's a drastic improvement in the SSLEngine shipped in JRE 9 as compared to JRE 8. OpenSSL (backed by WildFly OpenSSL), however, still out-performs the default shipped SSLEngine even in Java 9.

        [1] https://jaitechwriteups.blogspot.com/2017/10/kafka-with-openssl.html

        jaikiran jaikiran pai added a comment -

        I just came across this JIRA, so I thought I would update it with my own recent experiments with OpenSSL (Java 8) and Kafka. For those interested, I gathered some performance numbers for OpenSSL (backed by the WildFly OpenSSL Java bindings) and have detailed them in my blog[1]. Later this week, I plan to rerun the same thing with Java 9 and see how it performs.

        [1] https://jaitechwriteups.blogspot.com/2017/10/kafka-with-openssl.html

        salyh Hendrik Saly added a comment -

        Here is a working draft: https://github.com/salyh/kafka/commit/9337c56df9b8387bf42f756faf5be08118259139

        This is a first sketch to make SslFactory ready for native OpenSSL support, leveraging netty and netty tcnative.
        It requires netty 4.0.30 (common, handler, buffer, codec), tcnative fork-1.1.33.19 for the respective OS and, of course, an installed OpenSSL (a recent 1.0.1 or, better, 1.0.2). I could not get the Gradle dependency setup to work, so perhaps someone else can add the required dependencies.

        thesquelched Scott Kruger added a comment -

        Now that Java 9 has been delayed until 2017, can this get a bump in priority?

        ijuma Ismael Juma added a comment - - edited

        Encryption speed improvements in JDK 9 (note the following doesn't include the `SSLEngine` overhead, which is probably still significant):

        • JDK 9: up to a 62x performance gain over the JDK 8 GA implementation
        • up to 5.45x over 8u60 implementation
        • 8u60 performance improved due to https://bugs.openjdk.java.net/browse/JDK-8069072

        https://blogs.oracle.com/mullan/entry/slides_for_javaone_2015_session

        ijuma Ismael Juma added a comment -

        The current plan is to go with the JDK implementation for 0.9.0.0. As you say, there isn't enough time to do anything else. If and when we add an implementation based on OpenSSL, it will be optional. The default engine can be discussed when we get to that point.

        I doubt that the Java 9 implementation will be faster than OpenSSL, but it will hopefully narrow the gap. In any case, that's quite far away as you said. And people take their time to upgrade as well.

        I filed this issue to record and share our findings, but we are not planning to do additional work on this until after 0.9.0.0 now.

        becket_qin Jiangjie Qin added a comment -

        Ismael Juma, thanks for the explanation. It looks like Java 9 is still one year away from release. I am not sure what we should do now. It might take several weeks, as you suggested, to implement the patch and some more time to make it stable. A few months later, if Java 9 has better performance than OpenSSL, are we going to switch back to the JDK? What do you think?

        ijuma Ismael Juma added a comment -

        Jiangjie Qin, there are two problems:

        • The JDK SSLEngine implementation for AES-GCM doesn't use the relevant CPU instructions yet, whereas a highly optimised implementation of AES-GCM is faster than a highly optimised implementation of AES-CBC. OpenSSL is twice as fast when using AES-GCM instead of AES-CBC in an SSLEngine micro-benchmark. In the same benchmark, JDK 8u60 with AES-CBC is four times faster than with AES-GCM. This issue should improve in Java 9 (http://openjdk.java.net/jeps/8046943).
        • The JDK SSLEngine implementation is not particularly well implemented and generates a lot of garbage based on what Netty's Norman Maurer said in a presentation.
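To get a rough feel for the AES-CBC versus AES-GCM gap on a given JDK, the raw ciphers can be timed directly. This is only a sketch under stated assumptions: it exercises `javax.crypto.Cipher` rather than `SSLEngine`, it is not the micro-benchmark referred to above, and a naive timing loop like this one (no JMH, a single warm-up pass) gives indicative numbers at best. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.spec.AlgorithmParameterSpec;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper: naive raw-cipher timing, not an SSLEngine benchmark.
final class CipherModeTiming {
    private static String transformation(boolean gcm) {
        return gcm ? "AES/GCM/NoPadding" : "AES/CBC/PKCS5Padding";
    }

    // A distinct IV per call; GCM must never reuse an IV with the same key.
    private static AlgorithmParameterSpec params(boolean gcm, int counter) {
        byte[] iv = new byte[gcm ? 12 : 16];
        ByteBuffer.wrap(iv).putInt(counter);
        return gcm ? new GCMParameterSpec(128, iv) : new IvParameterSpec(iv);
    }

    /** Encrypts data `rounds` times and returns the observed MB/s. */
    static double encryptMbPerSec(boolean gcm, SecretKey key, byte[] data, int rounds) {
        try {
            Cipher cipher = Cipher.getInstance(transformation(gcm));
            long start = System.nanoTime();
            for (int i = 0; i < rounds; i++) {
                cipher.init(Cipher.ENCRYPT_MODE, key, params(gcm, i));
                cipher.doFinal(data);
            }
            double seconds = (System.nanoTime() - start) / 1e9;
            return (data.length / (1024.0 * 1024.0)) * rounds / seconds;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Encrypts then decrypts, as a sanity check that the mode is wired up correctly. */
    static byte[] roundTrip(boolean gcm, SecretKey key, byte[] data) {
        try {
            Cipher cipher = Cipher.getInstance(transformation(gcm));
            cipher.init(Cipher.ENCRYPT_MODE, key, params(gcm, 1));
            byte[] cipherText = cipher.doFinal(data);
            cipher.init(Cipher.DECRYPT_MODE, key, params(gcm, 1));
            return cipher.doFinal(cipherText);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = new SecretKeySpec(new byte[16], "AES"); // fixed all-zero test key, AES-128
        byte[] data = new byte[16 * 1024];                      // about one TLS record of payload
        encryptMbPerSec(false, key, data, 2_000);               // warm-up passes, results discarded
        encryptMbPerSec(true, key, data, 2_000);
        System.out.printf("AES-CBC: %.1f MB/s%n", encryptMbPerSec(false, key, data, 20_000));
        System.out.printf("AES-GCM: %.1f MB/s%n", encryptMbPerSec(true, key, data, 20_000));
    }
}
```

The absolute numbers depend heavily on the JDK version and CPU, so they are best read as a ratio between the two modes on one machine.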

        I don't think we can do much about it apart from waiting for Java 9 and potentially contributing improvements to the implementation. We are not the first to run into this and that is why Tomcat, Finagle and Netty have the OpenSSL implementation too.

        Also note that the results I posted in the description are for a test with a single broker and consumer running on the same machine. If you have many cores and they are lightly loaded, you may still be able to saturate the network in the meantime.

        becket_qin Jiangjie Qin added a comment -

        Ismael Juma, just curious: have we found out why the performance differs so much? Is it possible to tweak some settings of the JDK `SSLEngine` to improve the performance?

        ijuma Ismael Juma added a comment - - edited

        In order to implement this properly (as opposed to a simple test), the following steps are needed:

        1. Add an optional build dependency on netty-tcnative. This library contains a fork of Tomcat Native that is published to Maven and available for the major platforms (Linux, OS X, Windows). It also handles extracting the platform-specific JNI code at runtime (similar to snappy-java). APR and OpenSSL need to be installed separately.
        2. Provide an implementation of `SSLEngine` based on OpenSSL. The easy option would be to add an optional dependency on `netty-handler`, which includes this. If this is not acceptable, there are some alternatives like extracting the code into a separate library or copying it into Kafka.
        3. Add a way to configure the `SSLEngine` implementation (OpenSSL or JDK).
        4. Change `SSLFactory` to build the appropriate `SSLEngine` based on the configuration added in `3`.
        5. Potentially introduce a runtime mechanism to select `OpenSslEngine` by default if the required libraries are present (since it's much faster).
        6. Potentially update `SSLTransportLayer` to handle differences in behaviour between the different `SSLEngine` implementations (the need for this depends on whether the issues reported to Netty are fixed or not). The main one is that `OpenSslEngine.unwrap` consumes multiple SSL records (instead of just one) and it may produce a different number of SSL records (if they don't all fit into the application buffer).
        7. Use `allocateDirect` to allocate the buffers in `SSLTransportLayer` when using `OpenSslEngine` to avoid copies on each `wrap` and `unwrap` call.
        8. Design and implement the story around the formats for keys, certificates, key chains and certificate chains supported. OpenSSL doesn't understand the JKS format since it's Java-specific. Netty uses the `PKCS#8` format for keys and PEM format for chains when the OpenSSL engine is used.
        9. Update tests to test all `SSLEngine` implementations.
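Steps 3, 5 and 7 in the list above could look roughly like the following. This is a sketch only: the class name, the `Impl` enum and the `ssl.engine.impl` property are hypothetical and do not exist in Kafka. The availability probe calls Netty's `io.netty.handler.ssl.OpenSsl.isAvailable()` reflectively so that the code still compiles and runs when the optional dependency is absent.

```java
import java.nio.ByteBuffer;
import java.util.Locale;

// Hypothetical sketch of engine selection and buffer allocation; not Kafka code.
final class SslEngineChoice {
    enum Impl { JDK, OPENSSL }

    /** True only if Netty's OpenSsl class is on the classpath and reports the native library as usable. */
    static boolean openSslAvailable() {
        try {
            Class<?> openSsl = Class.forName("io.netty.handler.ssl.OpenSsl");
            return (Boolean) openSsl.getMethod("isAvailable").invoke(null);
        } catch (ReflectiveOperationException | LinkageError e) {
            return false; // optional dependency or native library missing
        }
    }

    /**
     * Step 3: honour an explicit setting. Step 5: with no setting, prefer OpenSSL when present.
     * An unrecognised value fails with IllegalArgumentException from Enum.valueOf.
     */
    static Impl resolve(String configured, boolean openSslAvailable) {
        if (configured == null || configured.trim().isEmpty()) {
            return openSslAvailable ? Impl.OPENSSL : Impl.JDK;
        }
        Impl requested = Impl.valueOf(configured.trim().toUpperCase(Locale.ROOT));
        if (requested == Impl.OPENSSL && !openSslAvailable) {
            throw new IllegalStateException("OpenSSL engine requested but native libraries are not available");
        }
        return requested;
    }

    /** Step 7: direct buffers for the native engine, heap buffers for the JDK engine. */
    static ByteBuffer allocate(Impl impl, int capacity) {
        return impl == Impl.OPENSSL ? ByteBuffer.allocateDirect(capacity) : ByteBuffer.allocate(capacity);
    }

    public static void main(String[] args) {
        // "ssl.engine.impl" is a made-up property name used only for this illustration.
        Impl impl = resolve(System.getProperty("ssl.engine.impl"), openSslAvailable());
        // Placeholder size; real code would ask SSLSession.getPacketBufferSize().
        ByteBuffer netBuffer = allocate(impl, 32 * 1024);
        System.out.println(impl + " engine, direct buffer: " + netBuffer.isDirect());
    }
}
```

Failing fast when OpenSSL is requested explicitly but unavailable is one possible design choice; silently falling back to the JDK engine would hide a significant performance difference from the operator.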

        Testing of this is more complicated than usual due to the native code aspect and we would have to test it in all of our supported platforms.

        Given the work that I've already done, it would probably take a couple of weeks to agree on the details and implement the code (including unit tests). Maybe another week for testing on the various platforms.
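For step 8 in the list above, the key-format question can be illustrated with plain JDK classes. The sketch below writes and reads an unencrypted `PKCS#8` private key in PEM form; it is an illustration of the format only, not Netty's or Kafka's loading code, it ignores encrypted keys and certificate chains, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyFactory;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.spec.PKCS8EncodedKeySpec;
import java.util.Base64;

// Hypothetical helper: unencrypted PKCS#8 private keys in PEM form only.
final class Pkcs8Pem {
    private static final String HEADER = "-----BEGIN PRIVATE KEY-----";
    private static final String FOOTER = "-----END PRIVATE KEY-----";

    /** Wraps the key's PKCS#8 DER encoding in a PEM envelope with 64-character lines. */
    static String toPem(PrivateKey key) {
        if (!"PKCS#8".equals(key.getFormat())) {
            throw new IllegalArgumentException("Key is not PKCS#8 encoded: " + key.getFormat());
        }
        byte[] newline = "\n".getBytes(StandardCharsets.US_ASCII);
        String body = Base64.getMimeEncoder(64, newline).encodeToString(key.getEncoded());
        return HEADER + "\n" + body + "\n" + FOOTER + "\n";
    }

    /** Strips the PEM envelope and whitespace, then decodes the PKCS#8 structure. */
    static PrivateKey fromPem(String pem, String algorithm) {
        String body = pem.replace(HEADER, "").replace(FOOTER, "").replaceAll("\\s", "");
        byte[] der = Base64.getDecoder().decode(body);
        try {
            return KeyFactory.getInstance(algorithm).generatePrivate(new PKCS8EncodedKeySpec(der));
        } catch (GeneralSecurityException e) {
            throw new IllegalArgumentException("Not a valid " + algorithm + " PKCS#8 key", e);
        }
    }

    /** Test helper: generates a throwaway RSA key. */
    static PrivateKey generateRsaKey() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair().getPrivate();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        PrivateKey original = generateRsaKey();
        PrivateKey reloaded = fromPem(toPem(original), "RSA");
        System.out.println("round trip ok: " + original.equals(reloaded));
    }
}
```

A key held in a JKS keystore would first have to be exported to this form before an OpenSSL-based engine could consume it, which is part of what the configuration story in step 8 has to cover.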


          People

          • Assignee:
            Unassigned
            Reporter:
            ijuma Ismael Juma