Details
- Type: Epic
- Status: Accepted
- Priority: Critical
- Resolution: Unresolved
Allocator metrics
Description
There are currently no metrics that provide visibility into the allocator, except for the event queue size. This makes monitoring and debugging allocation behavior in a multi-framework setup difficult.
Some thoughts for initial metrics to add:
- How many allocation runs have completed? (counter): MESOS-4718
- How many offers has each role / framework received? (counter): MESOS-4719
- Current allocation breakdown: allocated / available / total (gauges): MESOS-4720
- Current maximum shares (gauges): MESOS-4724
- How many active filters are there for the role / framework? (gauges): MESOS-4722
- How many frameworks are suppressing offers? (gauges)
- How long does an allocation run take? (timers): MESOS-4721
- Maintenance related metrics:
  - How many maintenance events are active? (gauges)
  - How many maintenance events are scheduled but not active (gauges)
- Quota related metrics:
  - How much quota is set for each role? (gauges)
  - How much quota is satisfied? How much unsatisfied? (gauges): MESOS-4723
Some of these are already exposed from the master's metrics, but we should not assume this within the allocator.
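To make the three metric kinds above concrete, here is a minimal, language-neutral sketch in Python. It is not the Mesos or libprocess API: the `AllocatorMetrics` class, its methods, the metric key names, and all resource numbers are hypothetical and chosen only for illustration. Counters only ever increase, gauges are evaluated when a snapshot is requested, and the timer records the duration of each completed allocation run.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class AllocatorMetrics:
    """Hypothetical metrics container (illustration only, not a Mesos API)."""

    def __init__(self):
        self.allocation_runs = 0             # counter: completed allocation runs
        self.offers_sent = defaultdict(int)  # counter per role
        self.allocation_run_ms = []          # timer samples, in milliseconds
        self._gauges = {}                    # gauge name -> zero-argument callable

    def add_gauge(self, name, fn):
        # Gauges are pull-based: fn is called only when snapshot() runs.
        self._gauges[name] = fn

    @contextmanager
    def time_allocation_run(self):
        start = time.monotonic()
        yield
        # Reached only if the body of the `with` block did not raise.
        self.allocation_run_ms.append((time.monotonic() - start) * 1000.0)
        self.allocation_runs += 1

    def snapshot(self):
        out = {"allocator/allocation_runs": self.allocation_runs}
        for role, count in self.offers_sent.items():
            out[f"allocator/roles/{role}/offers_sent"] = count
        for name, fn in self._gauges.items():
            out[name] = fn()
        if self.allocation_run_ms:
            out["allocator/allocation_run_ms/max"] = max(self.allocation_run_ms)
        return out


# Made-up allocator state, for illustration only.
total_cpus = 16.0
allocated = {"web": 6.0, "batch": 4.0}
quota = {"web": 8.0}

metrics = AllocatorMetrics()
metrics.add_gauge("allocator/resources/cpus/total", lambda: total_cpus)
metrics.add_gauge("allocator/resources/cpus/allocated",
                  lambda: sum(allocated.values()))
metrics.add_gauge("allocator/resources/cpus/available",
                  lambda: total_cpus - sum(allocated.values()))
metrics.add_gauge("allocator/quota/web/cpus/satisfied",
                  lambda: min(allocated["web"], quota["web"]))
metrics.add_gauge("allocator/quota/web/cpus/unsatisfied",
                  lambda: max(quota["web"] - allocated["web"], 0.0))

# One simulated allocation run that sends three offers.
with metrics.time_allocation_run():
    metrics.offers_sent["web"] += 1
    metrics.offers_sent["batch"] += 2

snap = metrics.snapshot()
```

In this sketch the gauges read the allocator's own state (`allocated`, `quota`) rather than anything reported by the master, which mirrors the point that the allocator should not assume the master already exposes these values.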
Issue Links
- is blocked by: MESOS-4783 Disable rate limiting of the global metrics endpoint for mesos-tests execution (Resolved)