[HELIX-444] add per-participant partition count gauges to helix - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 0.7.1, 0.6.4
Component/s: None
Labels: None

Description

We need a way to subtract the partitions known to be down from DifferenceWithIdealState when an instance is offline, so that the alert volume is reduced to only the down-instance notification. Without metrics from Helix indicating the number of partitions hosted on a given participant, we can't tell which "DifferenceWithIdealState" counts are expected to be down and which reflect an actual difference caused by something other than a node outage.
These gauges should be produced on a per-participant, per-resource basis (e.g., helix.i001.participantstatus.cluster.host.db.partitiongauge = 64, or similar).
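
The counting described above could look roughly like the following. This is only an illustrative sketch, not the implementation that went into Helix: the class name `PartitionGauges`, both method names, and the nested-map input shape (resource -> partition -> instance -> state) are assumptions made for the example, and no Helix API is used.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Helix.
public class PartitionGauges {

    // Input (assumed shape): resource -> partition -> (instance -> state).
    // Output: instance -> resource -> number of partition replicas assigned to it.
    static Map<String, Map<String, Integer>> countPerParticipant(
            Map<String, Map<String, Map<String, String>>> assignment) {
        Map<String, Map<String, Integer>> counts = new TreeMap<>();
        for (Map.Entry<String, Map<String, Map<String, String>>> resource : assignment.entrySet()) {
            for (Map<String, String> replicas : resource.getValue().values()) {
                for (String instance : replicas.keySet()) {
                    // One increment per replica hosted by this instance for this resource.
                    counts.computeIfAbsent(instance, k -> new TreeMap<>())
                          .merge(resource.getKey(), 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Subtracts the partitions known to sit on down instances from the reported
    // difference; whatever remains is not explained by the node outage.
    static int unexplainedDifference(int differenceWithIdealState, int partitionsOnDownInstances) {
        return Math.max(0, differenceWithIdealState - partitionsOnDownInstances);
    }
}
```

With gauges like these, an alerting rule could compare the DifferenceWithIdealState count for a resource against the gauge value of the offline participant and fire only when `unexplainedDifference` is greater than zero.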

People

Assignee: Zhen Zhang

Reporter: Zhen Zhang

Votes: 0

Watchers: 2

Dates

Created: 14/May/14 00:10

Updated: 25/Aug/14 17:33

Resolved: 25/Aug/14 17:33