Details
Improvement
Status: Resolved
Normal
Resolution: Fixed
None
Performance
Normal
All
None
Description
Every 5 seconds with the default read_request_timeout (or, in the old naming scheme, read_request_timeout_in_ms), a scheduled task updates the speculation thresholds (for reads and writes) for all active tables. However, there are a few issues with the way we do this:
1.) Whether or not the SpeculativeRetryPolicy implementations in use actually look at them, we create latency histogram snapshots to pass to calculateThreshold(). We could trivially avoid this by having the method take an argument of type Sampling and build the snapshot only when necessary.
2.) The only reason we build the histogram snapshot is to find the new threshold value for the percentile-based policies. EstimatedHistogramReservoirSnapshot creates copies of both the decaying and non-decaying buckets, but we don’t use the non-decaying values at all for percentile calculation. Just avoiding the creation of the non-decaying values array would cut allocations in half.
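A minimal sketch of the idea in 1.), to make the proposal concrete. All types here (LatencySnapshot, LatencySampling, RetryPolicy, FixedPolicy, PercentilePolicy, CountingSampling) are hypothetical stand-ins written for illustration, not the actual Cassandra or metrics-library classes. The point is only that a policy which ignores latency never triggers a snapshot copy, while a percentile policy builds one on demand.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a histogram snapshot.
interface LatencySnapshot {
    double getValue(double quantile);
}

// Hypothetical stand-in for something that can produce a snapshot on demand.
interface LatencySampling {
    LatencySnapshot getSnapshot();
}

// Hypothetical policy interface: takes the sampling source, not a prebuilt snapshot.
interface RetryPolicy {
    long calculateThreshold(LatencySampling latency, long existingValue);
}

// A fixed policy never looks at latency, so no snapshot is ever built for it.
final class FixedPolicy implements RetryPolicy {
    private final long threshold;

    FixedPolicy(long threshold) {
        this.threshold = threshold;
    }

    @Override
    public long calculateThreshold(LatencySampling latency, long existingValue) {
        return threshold;
    }
}

// A percentile policy is the only caller that pays for the snapshot.
final class PercentilePolicy implements RetryPolicy {
    private final double quantile;

    PercentilePolicy(double quantile) {
        this.quantile = quantile;
    }

    @Override
    public long calculateThreshold(LatencySampling latency, long existingValue) {
        return (long) latency.getSnapshot().getValue(quantile);
    }
}

// Test double that counts how many snapshots were built. Expects at least one value.
final class CountingSampling implements LatencySampling {
    private final long[] sorted;
    final AtomicInteger snapshotsBuilt = new AtomicInteger();

    CountingSampling(long... values) {
        sorted = values.clone();
        Arrays.sort(sorted);
    }

    @Override
    public LatencySnapshot getSnapshot() {
        snapshotsBuilt.incrementAndGet();
        long[] copy = sorted.clone(); // the allocation we would like to skip when unused
        return q -> copy[(int) Math.min(copy.length - 1, Math.floor(q * copy.length))];
    }
}
```

With this shape, the periodic task can pass the same sampling source to every policy, and the copy happens only inside the percentile case.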
Given that even our snapshots aren’t perfectly consistent, it might also be possible to calculate a percentile value directly from the reservoir’s decaying buckets, without copying them at all. That might be less accurate, though, as new values could be added to the buckets after the total count is calculated.
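A rough sketch of what the copy-free calculation could look like, assuming the buckets are exposed as an AtomicLongArray with a parallel array of bucket upper bounds. The class BucketPercentile and its parameters are hypothetical and do not reflect the reservoir's real layout. It makes two passes over the live buckets, which is exactly where the inaccuracy mentioned above comes from: counts can change between the first pass (total) and the second (walk).

```java
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical helper: computes a percentile from live buckets without copying them.
final class BucketPercentile {
    private BucketPercentile() {
    }

    static long percentile(AtomicLongArray buckets, long[] upperBounds, double quantile) {
        if (buckets.length() != upperBounds.length) {
            throw new IllegalArgumentException("buckets and upperBounds must have the same length");
        }

        // Pass 1: total count. Concurrent writers may add values after this point.
        long total = 0;
        for (int i = 0; i < buckets.length(); i++) {
            total += buckets.get(i);
        }
        if (total == 0) {
            return 0;
        }

        // Pass 2: walk until the cumulative count reaches the target rank.
        long target = (long) Math.ceil(quantile * total);
        long seen = 0;
        for (int i = 0; i < buckets.length(); i++) {
            seen += buckets.get(i);
            if (seen >= target) {
                return upperBounds[i];
            }
        }

        // Only reachable if counts shrank between the passes (e.g. a rescale); use the last bucket.
        return upperBounds[upperBounds.length - 1];
    }
}
```

If counts only grow between the two passes, the walk can stop a bucket early compared to a consistent snapshot, so the result would skew slightly low under concurrent writes.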