[CASSANDRA-17175] More detailed latency metrics - ASF JIRA

Details

Type: New Feature
Status: Open
Priority: Normal
Resolution: Unresolved
Fix Version/s: 5.x
Component/s: Observability/Metrics
Labels: None

Change Category: Operability
Complexity: Normal
Platform: All
Impacts: None

Description

There is a disconnect between the latency clients experience and the latency reported by Cassandra. For example, read latency only measures the duration of the StorageProxy::readRows call.

None of the time spent sitting in the Native Transport queue is measured. Neither is any of the time for writing the response back to the channel.

Dispatcher processRequest keeps track of when it first starts processing the request, but as best I can tell this is only used for tracking timeouts.

More detailed Cassandra metrics around this would be useful for tracking down the cause of high client latency.

I have attached a patch that adds latency tracking higher in the call stack, starting the timer before the request is put into the Native Transport Request (NTR) executor. The patch gives 3 different metrics per Request type:

delay - time from when the request is submitted to the NTR pool until processRequest is called

process - time spent in the Dispatcher processRequest call

total - time from when the request is first submitted to the NTR pool until the response has been flushed
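
To make the relationship between the three metrics concrete, here is a minimal sketch in Java. It is not taken from the attached patch: the class name RequestLatencySketch, the nested Timer, and the record method are all hypothetical, and a real implementation would presumably use Cassandra's existing metrics classes and keep one set of timers per Request type. The four timestamps are assumed to be System.nanoTime() values captured at submission to the NTR pool, at the start and end of processRequest, and after the response flush.

```java
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical sketch; names are illustrative and not from the attached patch. */
final class RequestLatencySketch {

    /** Minimal stand-in for a metrics timer: tracks event count and summed nanoseconds. */
    static final class Timer {
        private final LongAdder count = new LongAdder();
        private final LongAdder totalNanos = new LongAdder();

        void update(long nanos) {
            count.increment();
            totalNanos.add(nanos);
        }

        long count() { return count.sum(); }
        long totalNanos() { return totalNanos.sum(); }
    }

    final Timer delay = new Timer();   // submitted to NTR pool -> processRequest starts
    final Timer process = new Timer(); // time inside processRequest
    final Timer total = new Timer();   // submitted to NTR pool -> response flushed

    /**
     * Records one request from four timestamps (assumed to come from System.nanoTime()).
     * Differences are used rather than absolute values, as nanoTime requires.
     */
    void record(long submittedAt, long processStart, long processEnd, long flushedAt) {
        delay.update(processStart - submittedAt);
        process.update(processEnd - processStart);
        total.update(flushedAt - submittedAt);
    }
}
```

With this breakdown, total minus (delay + process) approximates the time spent writing and flushing the response, which is the other portion the existing metrics do not cover.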

This patch may not be the cleanest or best way of doing this, but it hopefully gives an idea of what I think would be a useful addition to help operators diagnose latency issues.

Attachments

request_latency_metric.patch
29/Nov/21 23:49
12 kB
Cameron Zemek

People

Assignee: Stefan Miklosovic

Reporter: Cameron Zemek

Authors: Stefan Miklosovic

Votes: 0

Watchers: 5

Dates

Created: 29/Nov/21 23:57

Updated: 07/Mar/23 10:54