Details
- Improvement
- Status: Resolved
- Normal
- Resolution: Fixed
- None
Description
I could not find a situation in which Adler32 outperformed PureJavaCrc32, much less the CRC32 intrinsic from Java 8. For small allocations PureJavaCrc32 was much faster, probably because of the JNI overhead of invoking the native Adler32 implementation, where the array has to be allocated and copied.
I tested on a 65 W Sandy Bridge i5 running Ubuntu 14.04 with JDK 1.7.0_71, as well as on a c3.8xlarge running Ubuntu 14.04.
I think it makes sense to stop using Adler32 when generating new checksums.
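For anyone who wants to repeat the comparison, here is a minimal sketch of how the per-allocation-size timing might be done. It is not the harness that produced the numbers in this ticket: the class name `ChecksumTiming`, the helper `timeMillis`, and the 64 MiB total per cell are all hypothetical, and it only covers the two JDK classes (`PureJavaCrc32` lives outside the JDK, but it implements `java.util.zip.Checksum` and could be passed in the same way).

```java
import java.util.Random;
import java.util.zip.Adler32;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

public class ChecksumTiming {
    // Hypothetical helper: checksums roughly totalBytes of data in chunks of
    // chunkSize bytes and returns the elapsed wall-clock time in milliseconds.
    static long timeMillis(Checksum checksum, int chunkSize, long totalBytes) {
        byte[] chunk = new byte[chunkSize];
        new Random(42).nextBytes(chunk); // fixed seed so every run sees the same data
        long iterations = totalBytes / chunkSize;
        long start = System.nanoTime();
        for (long i = 0; i < iterations; i++) {
            checksum.reset();
            checksum.update(chunk, 0, chunk.length);
        }
        long elapsedNanos = System.nanoTime() - start;
        // Read the value so the JIT cannot treat the loop as dead code.
        if (checksum.getValue() < 0) {
            throw new IllegalStateException("checksum values are never negative");
        }
        return elapsedNanos / 1_000_000;
    }

    public static void main(String[] args) {
        int[] sizes = {64, 128, 256, 1024, 1048576};
        long totalBytes = 64L * 1024 * 1024; // assumed workload per cell
        System.out.println("Allocation size | Adler32 | CRC32 |");
        for (int size : sizes) {
            long adler = timeMillis(new Adler32(), size, totalBytes);
            long crc = timeMillis(new CRC32(), size, totalBytes);
            System.out.println(size + " | " + adler + " | " + crc + " |");
        }
    }
}
```

This sketch has no warm-up phase and runs each cell once, so the absolute numbers will be noisy. A benchmark harness such as JMH would be a better basis for anything beyond a rough check.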
c3.8xlarge (times in milliseconds; lower is better)
Allocation size | Adler32 | CRC32 | PureJavaCrc32 |
---|---|---|---|
64 | 47636 | 46075 | 25782 |
128 | 36755 | 36712 | 23782 |
256 | 31194 | 32211 | 22731 |
1024 | 27194 | 28792 | 22010 |
1048576 | 25941 | 27807 | 21808 |
536870912 | 25957 | 27840 | 21836 |
i5
Allocation size | Adler32 | CRC32 | PureJavaCrc32 |
---|---|---|---|
64 | 50539 | 50466 | 26826 |
128 | 37092 | 38533 | 24553 |
256 | 30630 | 32938 | 23459 |
1024 | 26064 | 29079 | 22592 |
1048576 | 24357 | 27911 | 22481 |
536870912 | 24838 | 28360 | 22853 |
Another fun fact: performance of the CRC32 intrinsic appears to double from Sandy Bridge to Haswell, unless I am measuring something different when going from Linux/Sandy Bridge to OS X/Haswell.
The intrinsic (JDK 8) implementation also handles DirectByteBuffers better, so code written against the wrapper will get that boost when run on Java 8.
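To illustrate the DirectByteBuffer point, here is a small sketch using `CRC32.update(ByteBuffer)`, which the JDK added in Java 8. The class `DirectBufferCrc` and the helper `crcOf` are hypothetical names for illustration and are not the wrapper referred to above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class DirectBufferCrc {
    // Hypothetical helper: returns the CRC32 of the buffer's remaining bytes.
    // It works on a duplicate, so the caller's position and limit are untouched.
    static long crcOf(ByteBuffer buffer) {
        CRC32 crc = new CRC32();
        crc.update(buffer.duplicate()); // Java 8+: accepts heap and direct buffers
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);

        ByteBuffer heap = ByteBuffer.wrap(data);

        ByteBuffer direct = ByteBuffer.allocateDirect(data.length);
        direct.put(data);
        direct.flip();

        // Both calls checksum the same bytes, so the values should match.
        System.out.println(Long.toHexString(crcOf(heap)));
        System.out.println(Long.toHexString(crcOf(direct)));
    }
}
```

Passing the buffer straight to `update` avoids copying a direct buffer's contents into a temporary `byte[]` in calling code.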
Attachments
Issue Links
- depends upon: CASSANDRA-8614 Select optimal CRC32 implementation at runtime (Resolved)
- is related to: CASSANDRA-13908 Add JDK9 new CRC32C checksum (Open)
- relates to: CASSANDRA-7130 Make sstable checksum type configurable and optional (Resolved)