I would like to propose a small patch to
org.apache.hadoop.util.PureJavaCrc32.update(byte[] b, int off, int len)
Currently the method stores the intermediate result back into the data member "crc". I noticed that this method gets
inlined into DataChecksum.update(), which appears as one of the hotter methods in a simple hprof profile collected while running terasort and gridmix.
If the code is modified to keep the intermediate result in a local variable and store only the final result back into the data member, once, HotSpot generates slightly more efficient code.
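To illustrate the idea, here is a minimal sketch of the two variants. It is not the actual PureJavaCrc32 code or the patch: it uses a simple byte-at-a-time table-driven CRC-32, and the class name LocalCrcSketch and the method names updateViaField and updateViaLocal are made up for this example. Only the pattern matters: read the field into a local, loop on the local, store it back once.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical illustration class; NOT the Hadoop implementation.
final class LocalCrcSketch {
    // Standard reflected CRC-32 lookup table (polynomial 0xEDB88320).
    private static final int[] TABLE = new int[256];
    static {
        for (int i = 0; i < 256; i++) {
            int c = i;
            for (int k = 0; k < 8; k++) {
                c = (c & 1) != 0 ? (c >>> 1) ^ 0xEDB88320 : c >>> 1;
            }
            TABLE[i] = c;
        }
    }

    // Running CRC state, kept bit-inverted as usual for CRC-32.
    private int crc = 0xFFFFFFFF;

    // "Before" pattern: the field is read and written on every iteration.
    void updateViaField(byte[] b, int off, int len) {
        for (int i = off; i < off + len; i++) {
            crc = (crc >>> 8) ^ TABLE[(crc ^ b[i]) & 0xFF];
        }
    }

    // "After" pattern: work on a local copy and store the field once at the end.
    void updateViaLocal(byte[] b, int off, int len) {
        int localCrc = crc;
        for (int i = off; i < off + len; i++) {
            localCrc = (localCrc >>> 8) ^ TABLE[(localCrc ^ b[i]) & 0xFF];
        }
        crc = localCrc; // single store of the fully computed value
    }

    long getValue() {
        return (~crc) & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);

        LocalCrcSketch viaField = new LocalCrcSketch();
        viaField.updateViaField(data, 0, data.length);

        LocalCrcSketch viaLocal = new LocalCrcSketch();
        viaLocal.updateViaLocal(data, 0, data.length);

        // java.util.zip.CRC32 serves as the reference implementation.
        CRC32 reference = new CRC32();
        reference.update(data, 0, data.length);

        System.out.println("field:     " + Long.toHexString(viaField.getValue()));
        System.out.println("local:     " + Long.toHexString(viaLocal.getValue()));
        System.out.println("reference: " + Long.toHexString(reference.getValue()));
    }
}
```

Both variants compute the same checksum; the difference is only in how often the field is written. Whether the JIT can eliminate the per-iteration stores on its own depends on the JVM version and the surrounding code, so any gain should be confirmed by measurement.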
I tested this change using org.apache.hadoop.util.TestPureJavaCrc32$PerformanceTest, which is embedded in the existing unit test for this class, TestPureJavaCrc32, on a variety of Linux x64 AMD and Intel multi-socket and multi-core systems I have available.
The patch removes several stores of the intermediate result to memory, yielding a 0%-10% speedup in that test.
If you use a debug HotSpot JVM with -XX:+PrintOptoAssembly, you can see the intermediate stores such as:
The patch results in just one final store of the fully computed value.