If the built-in Prometheus metrics feature introduced after version 3.6 is enabled, the percentile metrics (Summary) used to collect request latencies can easily become a bottleneck under high load (for example, a large number of read requests) and degrade the service itself. This is because the internal implementation of Summary involves lock operations, so with many concurrent requests, lock contention can cause request latency to deteriorate dramatically. Details of this issue and the related profiling are in ZOOKEEPER-4741.
In ZOOKEEPER-4289, updates to Summary were moved to a separate thread pool. This avoids the lock contention caused by multiple threads updating Summary simultaneously, but it adds the operational overhead of the thread pool queue and extra garbage collection (GC) overhead. In particular, when the thread pool queue is full, a large number of RejectedExecutionException instances are thrown, which further increases GC pressure.
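The rejection behavior described above can be reproduced with the JDK alone. The following is a minimal sketch, not ZooKeeper's actual executor configuration: the class name `RejectedUpdateDemo`, the single worker thread, and the queue capacity are all assumptions chosen for illustration. It blocks the worker so the bounded queue fills, then counts how many submissions are rejected.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; it only illustrates the general JDK behavior.
final class RejectedUpdateDemo {

    /**
     * Submits `updates` tasks to a one-thread pool with a bounded queue while
     * the worker is blocked, and returns how many submissions were rejected.
     */
    static int countRejected(int queueCapacity, int updates) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.AbortPolicy()); // throws when saturated
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < updates; i++) {
                try {
                    // Stand-in for a slow metric update: wait until released.
                    pool.execute(() -> {
                        try {
                            release.await();
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    // Each rejection allocates an exception plus its stack trace.
                    rejected++;
                }
            }
        } finally {
            release.countDown();
            pool.shutdown();
            try {
                pool.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println("rejected = " + countRejected(4, 100));
    }
}
```

With one blocked worker and a queue of capacity 4, only five tasks are accepted (one running, four queued); every further submission allocates an exception, which is the kind of garbage the paragraph above refers to.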
To address the problems above, I have implemented an almost lock-free solution based on DataSketches. Benchmark results show that it is more than 10x faster than version 3.9.1 and avoids the frequent GC caused by creating a large number of temporary objects. The trade-off is that latency percentiles are reported with a delay (60 seconds by default), and each Summary metric carries a fixed amount of permanent memory overhead.
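The trade-off between delayed percentiles and fixed memory can be illustrated with a simplified rotating-buffer sketch. This is not the code in this change: to stay self-contained it replaces the DataSketches quantile sketch with a fixed 1 ms bucket histogram, and the class name `RotatingLatencyRecorder`, the bucket count, and the method names are all hypothetical. Recording is a single atomic increment with no lock; percentiles are only visible after `rotate()` has run, which in a real setup would be scheduled periodically (for example every 60 seconds).

```java
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical simplification: a bucket histogram stands in for a quantile sketch.
final class RotatingLatencyRecorder {
    private static final int BUCKETS = 1024; // 1 ms per bucket; last bucket is overflow

    // Two permanently allocated buffers: the fixed memory cost per metric.
    private volatile AtomicLongArray current = new AtomicLongArray(BUCKETS);
    private volatile AtomicLongArray replacement = new AtomicLongArray(BUCKETS);
    // Counts from the most recently completed interval; read by percentile().
    private volatile long[] snapshot = new long[BUCKETS];

    /** Hot path: one atomic increment, no lock, no allocation. */
    void record(long latencyMs) {
        int idx = (int) Math.min(Math.max(latencyMs, 0L), BUCKETS - 1);
        current.incrementAndGet(idx);
    }

    /** Called periodically by a single background thread. */
    synchronized void rotate() {
        AtomicLongArray old = current;
        current = replacement; // new samples now go to the other buffer
        long[] snap = new long[BUCKETS];
        for (int i = 0; i < BUCKETS; i++) {
            // Drain and reset; a late writer's sample simply lands in a later interval.
            snap[i] = old.getAndSet(i, 0L);
        }
        snapshot = snap;
        replacement = old;
    }

    /** Returns the latency (ms) at quantile q in [0, 1] for the last completed interval. */
    long percentile(double q) {
        long[] snap = snapshot;
        long total = 0;
        for (long c : snap) {
            total += c;
        }
        if (total == 0) {
            return 0;
        }
        long rank = Math.max(1L, (long) Math.ceil(q * total));
        long seen = 0;
        for (int i = 0; i < snap.length; i++) {
            seen += snap[i];
            if (seen >= rank) {
                return i;
            }
        }
        return snap.length - 1;
    }

    public static void main(String[] args) {
        RotatingLatencyRecorder r = new RotatingLatencyRecorder();
        for (int ms = 1; ms <= 100; ms++) {
            r.record(ms);
        }
        System.out.println("before rotate: p50 = " + r.percentile(0.5));
        r.rotate();
        System.out.println("after rotate:  p50 = " + r.percentile(0.5));
    }
}
```

Samples recorded during the current interval are invisible until the next rotation, which corresponds to the reporting delay mentioned above, and the two pre-allocated buffers are why the memory cost is permanent rather than per-request.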
This solution draws on Matteo Merli's optimization work on the percentile latency metrics for BookKeeper, as detailed in https://github.com/apache/bookkeeper/commit/3bff19956e70e37c025a8e29aa8428937af77aa1.