Details
- Improvement
- Status: Resolved
- Major
- Resolution: Fixed
- 4.0.0
- None
- None
Description
In Compaction Observability, the Initiator/Worker/Cleaner cycle is measured with a Dropwizard Timer metric.
The Dropwizard documentation describes Timers as follows: "A timer measures both the rate that a particular piece of code is called and the distribution of its duration."
However, a Timer is not a good fit when only a duration needs to be measured. Furthermore, one HMS can run multiple Worker threads, and the duration of the last finished Worker is not informative if another Worker thread is stuck.
Timers do not carry enough information because they only bump the counter when a Worker has finished a loop.
If the Initiator, a Worker, or the Cleaner gets stuck, it never bumps the counter, so no metric is reported for it.
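The blind spot described above can be illustrated with a minimal sketch. `CompletionOnlyTimer` and `StuckWorkerDemo` are hypothetical names, and the class is a simplified stand-in that is updated only on completion, not Dropwizard's actual `Timer` implementation.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a metric that is updated only when a cycle completes. */
class CompletionOnlyTimer {
    private final AtomicLong count = new AtomicLong();
    private final AtomicLong lastDurationMs = new AtomicLong();

    /** Called by a thread only after it has finished a full cycle. */
    void update(long durationMs) {
        lastDurationMs.set(durationMs);
        count.incrementAndGet();
    }

    long getCount() { return count.get(); }

    long getLastDurationMs() { return lastDurationMs.get(); }
}

class StuckWorkerDemo {
    /** Returns {count, lastDurationMs} as reported while a second worker is stuck. */
    static long[] snapshotWithStuckWorker() {
        CompletionOnlyTimer timer = new CompletionOnlyTimer();
        // Worker 1 finishes a loop in 200 ms and reports it.
        timer.update(200);
        // Worker 2 started long ago and never returns, so it never calls update().
        // Nothing about worker 2 is visible in the metric.
        return new long[] { timer.getCount(), timer.getLastDurationMs() };
    }
}
```

The snapshot still shows one completed cycle of 200 ms, which looks healthy even though a second worker has been stuck the whole time.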
It would be better to implement the following:
- Time elapsed since the Initiator started (single-threaded) -> Gauge metric
- Oldest working compaction -> Gauge metric
- Oldest working Cleaner -> Gauge metric
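One way the gauges listed above could be backed is sketched below, assuming each thread records its start time when a cycle begins and clears it when the cycle ends. `InFlightAgeTracker`, its methods, and the thread identifiers are hypothetical and are not taken from the Hive code base. The clock is injected only to keep the example deterministic.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.LongSupplier;

/** Hypothetical tracker that remembers when each in-flight cycle started. */
class InFlightAgeTracker {
    private final ConcurrentMap<String, Long> startTimes = new ConcurrentHashMap<>();
    private final LongSupplier clock; // current time in ms; injectable for testing

    InFlightAgeTracker(LongSupplier clock) {
        this.clock = clock;
    }

    /** Called when a thread begins a cycle. */
    void started(String threadId) {
        startTimes.put(threadId, clock.getAsLong());
    }

    /** Called when a thread finishes a cycle. */
    void finished(String threadId) {
        startTimes.remove(threadId);
    }

    /** Gauge value: age in ms of the oldest in-flight cycle, or 0 if none is running. */
    long oldestAgeMs() {
        long now = clock.getAsLong();
        long oldestStart = now;
        for (long start : startTimes.values()) {
            oldestStart = Math.min(oldestStart, start);
        }
        return now - oldestStart;
    }
}
```

Because the value is computed when the gauge is read, a stuck thread makes the reported age keep growing instead of going silent. With Dropwizard Metrics, `oldestAgeMs()` could presumably be exposed through a `Gauge<Long>` registered on a `MetricRegistry`. The single-threaded Initiator would need only one entry in such a tracker.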
Issue Links
- is related to HIVE-25914 Cleaner updates Initiator cycle metric (Closed)