Details
Improvement
Status: Resolved
Low
Resolution: Fixed
None
None
JVM
Description
When compression is on, the checksum currently takes about 40% more CPU than the Snappy library.
It looks like Hadoop solved this by implementing its own checksum; we can either use that implementation or write something similar.
http://images.slidesharecdn.com/1toddlipconyanpeichen-cloudera-hadoopandperformance-final-111110132228-phpapp01-slide-15-768.jpg?1321043717
In our test environment it provided a 50% improvement over the native implementation, which uses JNI to call the OS.
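
To illustrate what a checksum that stays inside the JVM could look like, here is a minimal sketch of a table-driven CRC-32 implementing java.util.zip.Checksum with no JNI call. The class name SimpleCrc32 is hypothetical, and this byte-at-a-time version is an assumption made for brevity: it is not Hadoop's implementation and is not tuned, so it says nothing about the numbers quoted above.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

// Hypothetical class: a byte-at-a-time CRC-32 sketch, not Hadoop's code.
final class SimpleCrc32 implements Checksum {
    // Lookup table for the reflected CRC-32 polynomial 0xEDB88320.
    private static final int[] TABLE = new int[256];
    static {
        for (int i = 0; i < 256; i++) {
            int c = i;
            for (int k = 0; k < 8; k++) {
                c = (c & 1) != 0 ? (c >>> 1) ^ 0xEDB88320 : c >>> 1;
            }
            TABLE[i] = c;
        }
    }

    // Running CRC register, kept inverted as the algorithm requires.
    private int crc = 0xFFFFFFFF;

    @Override
    public void update(int b) {
        crc = TABLE[(crc ^ b) & 0xFF] ^ (crc >>> 8);
    }

    @Override
    public void update(byte[] buf, int off, int len) {
        int c = crc;
        for (int i = off; i < off + len; i++) {
            c = TABLE[(c ^ buf[i]) & 0xFF] ^ (c >>> 8);
        }
        crc = c;
    }

    @Override
    public long getValue() {
        // Undo the inversion and return the value as an unsigned 32-bit number.
        return (~crc) & 0xFFFFFFFFL;
    }

    @Override
    public void reset() {
        crc = 0xFFFFFFFF;
    }

    public static void main(String[] args) {
        byte[] data = "some compressed chunk".getBytes(StandardCharsets.UTF_8);

        Checksum pure = new SimpleCrc32();
        pure.update(data, 0, data.length);

        Checksum jdk = new CRC32(); // the JDK's built-in CRC-32
        jdk.update(data, 0, data.length);

        System.out.println("checksums match: " + (pure.getValue() == jdk.getValue()));
    }
}
```

Because the sketch implements Checksum, it could be swapped in wherever a java.util.zip.CRC32 instance is used through that interface. Any speedup would have to be measured on the target JVM.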