Details
- Type: Bug
- Status: Resolved
- Resolution: Fixed
Description
When testing a cluster with more requests than it could handle, I noticed significant CPU time (25%) spent in HintsStore.getTotalFileSize. Here's what I'm seeing from profiling:
10% of CPU time is spent in HintsDescriptor.fileName, which does nothing but:
return String.format("%s-%s-%s.hints", hostId, timestamp, version);
At a bare minimum, we should build this string up front with the host ID and version, eliminating two of the three substitutions. But it's probably faster still to use a StringBuilder and avoid String.format's underlying format-string parsing (which uses a regular expression) altogether.
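A minimal sketch of the idea (class and field names here are illustrative, not Cassandra's actual code): precompute the host-ID prefix and version suffix once per descriptor, so each call only appends the timestamp and never touches format-string parsing.

```java
import java.util.UUID;

// Hypothetical sketch: cache the "<hostId>-" prefix and "-<version>.hints"
// suffix at construction time, so fileName() does a plain StringBuilder
// append instead of String.format's regex-based format parsing per call.
public class HintsFileName {
    private final String prefix;  // "<hostId>-"
    private final String suffix;  // "-<version>.hints"

    public HintsFileName(UUID hostId, int version) {
        this.prefix = hostId.toString() + '-';
        this.suffix = "-" + version + ".hints";
    }

    // Produces the same result as
    //   String.format("%s-%s-%s.hints", hostId, timestamp, version)
    // with only the timestamp substituted at call time.
    public String fileName(long timestamp) {
        return new StringBuilder(prefix.length() + 20 + suffix.length())
                .append(prefix)
                .append(timestamp)
                .append(suffix)
                .toString();
    }

    public static void main(String[] args) {
        UUID hostId = UUID.fromString("5762b140-2782-4b4c-9374-8f6a1b0c2d3e");
        HintsFileName f = new HintsFileName(hostId, 2);
        String fast = f.fileName(1700000000000L);
        String slow = String.format("%s-%s-%s.hints", hostId, 1700000000000L, 2);
        System.out.println(fast.equals(slow)); // prints: true
    }
}
```

Same output, but the format string is never re-parsed on the hot path.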
12% of the time is spent in org.apache.cassandra.io.util.File.length. It looks like this is called once for each hint file on disk, for each host we're hinting to; on an overloaded cluster that adds up quickly. It would be better to track each hint file's size in memory and reference that, rather than going to the filesystem every time.
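The in-memory tracking could look something like this hedged sketch (HintsSizeTracker and its methods are hypothetical names, not Cassandra's actual API): the writer bumps a counter as it appends, and the total is computed from memory instead of stat-ing every file.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: keep a per-file byte counter updated on write,
// so computing the total hint size never calls File.length() (a
// filesystem stat) per file per host.
public class HintsSizeTracker {
    private final Map<String, AtomicLong> sizes = new ConcurrentHashMap<>();

    // Called by the writer after appending 'bytes' to 'fileName'.
    public void recordWrite(String fileName, long bytes) {
        sizes.computeIfAbsent(fileName, k -> new AtomicLong()).addAndGet(bytes);
    }

    // Called when a hint file is dispatched and deleted from disk.
    public void remove(String fileName) {
        sizes.remove(fileName);
    }

    // Replacement for summing File.length() over every file on disk.
    public long totalFileSize() {
        return sizes.values().stream().mapToLong(AtomicLong::get).sum();
    }
}
```

The counters would need to be seeded from the filesystem once at startup for files that already exist; after that, reads stay in memory.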
These fairly small changes should make Cassandra more reliable under load spikes.
CPU Flame graph attached.
I only tested this in 4.1 but it looks like this is present up to trunk.
Attachments
Issue Links
- fixes CASSANDRA-19485: if max_hints_size_per_host < max_hints_file_size then it will write hints after max_hints_size_per_host is reached (Resolved)