Hadoop Common / HADOOP-16011

OsSecureRandom very slow compared to other SecureRandom implementations

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.3.0
    • Component/s: security
    • Labels: None
    • Release Note:
      The default RNG is now OpensslSecureRandom instead of OsSecureRandom. The high-performance hardware random number generator (RDRAND instruction) will be used if available. If not, it will fall back to the OpenSSL secure random generator.
      If you insist on using OsSecureRandom, set hadoop.security.secure.random.impl in core-site.xml to org.apache.hadoop.crypto.random.OsSecureRandom.

    Description

      While looking at the performance of a workload that creates a lot of short-lived remote connections to a secured DN, philip and I found very high system CPU usage. We tracked it down to reads from /dev/random, which are incurred by the DN using CryptoCodec.generateSecureRandom to generate a transient session key and IV for AES encryption.

      In the case that the OpenSSL codec is not enabled, the above code falls through to the JDK SecureRandom implementation, which performs reasonably. However, OpenSSLCodec defaults to using OsSecureRandom, which reads all random data from /dev/random rather than doing something more efficient like initializing a CSPRNG from a small seed.
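
      To illustrate what "initializing a CSPRNG from a small seed" could look like, here is a minimal sketch using only the JDK. It is not the code from the attached patches. The class name SeededCsprngSketch and the helper expand are hypothetical, and the sketch assumes the default SUN provider, where calling setSeed before the first nextBytes makes the supplied seed the only entropy input to SHA1PRNG.

```java
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

// Hypothetical illustration class; not part of Hadoop.
final class SeededCsprngSketch {

    /** Expands a small seed into numBytes of pseudorandom output via SHA1PRNG. */
    static byte[] expand(byte[] seed, int numBytes) throws NoSuchAlgorithmException {
        SecureRandom prng = SecureRandom.getInstance("SHA1PRNG");
        // Assumption: with the default SUN provider, seeding before the first
        // nextBytes() call means all output is derived from this seed alone.
        prng.setSeed(seed);
        byte[] out = new byte[numBytes];
        prng.nextBytes(out);
        return out;
    }

    public static void main(String[] args) throws Exception {
        // One small draw from the platform default generator ...
        byte[] seed = new byte[32];
        new SecureRandom().nextBytes(seed);
        // ... is then stretched in user space, with no further OS reads.
        byte[] material = expand(seed, 4096);
        System.out.println("generated " + material.length + " bytes");
    }
}
```

      The point of the design is that the operating system is consulted once for a few bytes, and everything after that is computed in-process.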

      I wrote a simple JMH benchmark to compare various approaches when running with concurrency 10:
      testHadoop - using CryptoCodec
      testNewSecureRandom - using 'new SecureRandom()' each iteration
      testSha1PrngNew - using the SHA1PRNG explicitly, new instance each iteration
      testSha1PrngShared - using a single shared instance of SHA1PRNG
      testSha1PrngThread - using a thread-specific instance of SHA1PRNG

      Benchmark                         Mode  Cnt        Score   Error  Units
      MyBenchmark.testHadoop           thrpt          1293.000          ops/s  [with libhadoop.so]
      MyBenchmark.testHadoop           thrpt        461515.697          ops/s  [without libhadoop.so]
      MyBenchmark.testNewSecureRandom  thrpt         43413.640          ops/s
      MyBenchmark.testSha1PrngNew      thrpt        395515.000          ops/s
      MyBenchmark.testSha1PrngShared   thrpt        164488.713          ops/s
      MyBenchmark.testSha1PrngThread   thrpt       4295123.210          ops/s

      In other words, the presence of the OpenSSL acceleration (libhadoop.so loaded) slows down this code path by 356x. Compared to the fastest option (a thread-local SHA1PRNG), it is 3321x slower.
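
      For reference, the thread-local approach measured by testSha1PrngThread can be sketched roughly as follows. This is an illustrative reconstruction, not the contents of the attached MyBenchmark.java. The class name ThreadLocalPrngSketch, the helper randomBytes, and the 16-byte key and IV sizes are assumptions made for the example.

```java
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

// Hypothetical illustration class; not part of Hadoop or the attached benchmark.
final class ThreadLocalPrngSketch {

    // Each thread lazily creates its own SHA1PRNG, so threads never contend
    // on a shared generator's lock.
    private static final ThreadLocal<SecureRandom> PRNG = ThreadLocal.withInitial(() -> {
        try {
            return SecureRandom.getInstance("SHA1PRNG");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA1PRNG not available", e);
        }
    });

    /** Returns n random bytes from the calling thread's generator. */
    static byte[] randomBytes(int n) {
        byte[] out = new byte[n];
        // The generator seeds itself on first use; later calls stay in-process.
        PRNG.get().nextBytes(out);
        return out;
    }

    public static void main(String[] args) {
        byte[] key = randomBytes(16); // assumed size for a transient AES session key
        byte[] iv = randomBytes(16);  // assumed size for the IV
        System.out.println("key bytes: " + key.length + ", iv bytes: " + iv.length);
    }
}
```

      Compared with a single shared instance, this trades a small amount of per-thread memory and one seeding operation per thread for the absence of cross-thread synchronization, which is consistent with the gap between testSha1PrngShared and testSha1PrngThread above.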

      Attachments

        1. HADOOP-16011.001.patch
          1 kB
          Siyao Meng
        2. HADOOP-16011.002.patch
          2 kB
          Siyao Meng
        3. MyBenchmark.java
          3 kB
          Todd Lipcon

          People

            Assignee: smeng Siyao Meng
            Reporter: tlipcon Todd Lipcon