  1. Hadoop Common
  2. HADOOP-7333

Performance improvement in PureJavaCrc32



    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.21.0
    • Fix Version/s: 0.23.0
    • Component/s: performance, util
    • Labels: None
    • Environment: Linux x64

    • Hadoop Flags: Reviewed


      I would like to propose a small patch to

      org.apache.hadoop.util.PureJavaCrc32.update(byte[] b, int off, int len)

      Currently the method stores each intermediate result back into the data member "crc". I noticed that this method gets inlined into DataChecksum.update(), which appears as one of the hotter methods in a simple hprof profile collected while running terasort and gridmix.

      If the code is modified to keep the intermediate result in a local variable and store the final result back into the data member only once, HotSpot generates slightly more efficient code.
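
      The following is a minimal sketch of the idea, not the actual patch. The class CrcLocalSketch and its method names are hypothetical, and it assumes a simple one-byte-at-a-time lookup table rather than the real tables in PureJavaCrc32. It only shows the difference between writing the field inside the loop and accumulating in a local, and checks both variants against java.util.zip.CRC32.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical illustration class; not part of Hadoop.
class CrcLocalSketch {
    // One-byte-at-a-time CRC-32 table (reflected polynomial 0xEDB88320).
    // Assumption: simplified compared to the tables used by PureJavaCrc32.
    private static final int[] TABLE = new int[256];
    static {
        for (int i = 0; i < 256; i++) {
            int c = i;
            for (int k = 0; k < 8; k++) {
                c = (c & 1) != 0 ? (c >>> 1) ^ 0xEDB88320 : c >>> 1;
            }
            TABLE[i] = c;
        }
    }

    // Running CRC, kept in bit-inverted form between calls.
    private int crc = 0xffffffff;

    // "Before" shape: at source level, the field is read and written
    // on every loop iteration.
    void updateStoringField(byte[] b, int off, int len) {
        for (int i = off; i < off + len; i++) {
            crc = (crc >>> 8) ^ TABLE[(crc ^ b[i]) & 0xff];
        }
    }

    // "After" shape: accumulate in a local and store to the field once.
    void updateWithLocal(byte[] b, int off, int len) {
        int localCrc = crc;
        for (int i = off; i < off + len; i++) {
            localCrc = (localCrc >>> 8) ^ TABLE[(localCrc ^ b[i]) & 0xff];
        }
        crc = localCrc; // single final store
    }

    long getValue() {
        return (~crc) & 0xffffffffL;
    }

    public static void main(String[] args) {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);

        CrcLocalSketch field = new CrcLocalSketch();
        field.updateStoringField(data, 0, data.length);

        CrcLocalSketch local = new CrcLocalSketch();
        local.updateWithLocal(data, 0, data.length);

        CRC32 reference = new CRC32();
        reference.update(data, 0, data.length);

        // Both variants should agree with the JDK implementation.
        System.out.println(field.getValue() == reference.getValue());
        System.out.println(local.getValue() == reference.getValue());
    }
}
```

      Both methods compute the same checksum; whether the second form is measurably faster depends on the JVM and hardware, so any speedup should be confirmed with a benchmark.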

      I tested this change using "org.apache.hadoop.util.TestPureJavaCrc32$PerformanceTest", which is embedded in the existing unit test for this class, TestPureJavaCrc32, on a variety of Linux x64 AMD and Intel multi-socket and multi-core systems available to me.

      The patch removes several stores of the intermediate result to memory, yielding a 0%-10% speedup in the same PerformanceTest.

      If you use a debug build of the HotSpot JVM with -XX:+PrintOptoAssembly, you can see the intermediate stores, such as:

      414 movq R9, rsp + #24 # spill
      419 movl R9 + #12 (8-bit), RDX # int ! Field PureJavaCrc32.crc
      41d xorl R10, RDX # int

      The patch results in just one final store of the fully computed value.


        1. HADOOP-7333.patch
          1 kB
          Eric Caspole
        2. c7333_20110526.patch
          31 kB
          Tsz-wo Sze




              Assignee: Eric Caspole (ecaspole)
              Reporter: Eric Caspole (ecaspole)