Description
This change makes the pseudo-random number generator (PRNG) implementation used by the SSLContext configurable. The configuration is optional; when it is not set, the default PRNG for the JDK/JRE is used. Providing an algorithm name, such as "SHA1PRNG", causes that specific SecureRandom implementation to be passed to the SSLContext.
When enabling inter-broker SSL in our certification cluster, we observed severe performance issues. For reference, this cluster can take up to 600 MB/sec of inbound produce traffic over SSL, with RF=2, before it gets close to saturation, and the mirror maker normally produces about 400 MB/sec (unless it is lagging). With inter-broker SSL enabled, we saw persistent replication problems in the cluster at any inbound rate of more than about 6 or 7 MB/sec per broker. We narrowed this down to all the network threads blocking on a single lock in the SecureRandom code.
It turns out that the default PRNG implementation on Linux is NativePRNG. This reads randomness from /dev/urandom (which, by itself, is a non-blocking read) and mixes it with the output of a SHA1-based generator. The problem is that the entire application shares a single SecureRandom instance, and NativePRNG takes a global lock within the implNextBytes method, so every network thread that needs random bytes serializes on that lock. Switching to another implementation (SHA1PRNG, which has better performance characteristics and is still considered secure) completely eliminated the bottleneck and allowed the cluster to work properly at saturation.
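To see which implementation a given JVM picks by default, the algorithm name of a default-constructed SecureRandom can be inspected. The following is a minimal sketch; the class and method names (PrngDefaultCheck, defaultAlgorithm) are made up for illustration and are not part of this change. The value printed depends on the platform and JDK configuration.

```java
import java.security.SecureRandom;

final class PrngDefaultCheck {
    // Returns the algorithm name of the JVM's default SecureRandom,
    // i.e. what is used when no explicit implementation is requested.
    static String defaultAlgorithm() {
        return new SecureRandom().getAlgorithm();
    }

    public static void main(String[] args) {
        System.out.println("Default SecureRandom algorithm: " + defaultAlgorithm());
    }
}
```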
The SSLContext initialization takes an optional SecureRandom argument, which the code currently sets to null. This change adds a new config to specify an implementation; if one is provided, the code instantiates it and passes it to the SSLContext. This also lets someone select a stronger source of randomness (at a performance cost) if desired.
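The approach described above could look roughly like the sketch below. It is not the actual patch: the class and method names (PrngSslContextSketch, selectRandom, createContext) and the prngImplementation parameter are hypothetical stand-ins for the new config value, and the key and trust managers are left as null to keep the example short.

```java
import java.security.KeyManagementException;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import javax.net.ssl.SSLContext;

final class PrngSslContextSketch {
    // Hypothetical helper: returns null when no implementation is configured,
    // so SSLContext falls back to the JDK/JRE default PRNG.
    static SecureRandom selectRandom(String prngImplementation)
            throws NoSuchAlgorithmException {
        if (prngImplementation == null || prngImplementation.isEmpty()) {
            return null;
        }
        // Throws NoSuchAlgorithmException if the name is not supported.
        return SecureRandom.getInstance(prngImplementation);
    }

    // Builds an SSLContext and passes the selected SecureRandom (or null) to init.
    static SSLContext createContext(String prngImplementation)
            throws NoSuchAlgorithmException, KeyManagementException {
        SSLContext context = SSLContext.getInstance("TLS");
        // null key/trust managers select the installed defaults; a real
        // broker would supply its own configured managers here.
        context.init(null, null, selectRandom(prngImplementation));
        return context;
    }
}
```

Calling createContext("SHA1PRNG") selects that implementation explicitly, while createContext(null) keeps the current behavior. An unsupported name fails fast with NoSuchAlgorithmException, which seems preferable to silently falling back to the default.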