To make scheduler issues easier to debug, it would be a big improvement to be able to dump the scheduler state into a log on request.
Dumping the scheduler state at a point in time would allow debugging of a scheduler that is not hung (deadlocked) but is also not assigning containers. Currently we do not have a proper overview of the state the scheduler and the queues are in, so we have to make assumptions or guess.
The scheduler and queue state needed would include (not exhaustive):
- instantaneous and steady fair share (app / queue)
- AM share and resources
- app demand
- application run state (runnable/non-runnable)
- last time at fair/min share
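
As a rough illustration of the fields listed above, here is a minimal Java sketch of how one dump line per app or queue could be formatted. Every name in it (`SchedulerStateDump`, `Entry`, `formatEntry`, `dump`, and the field names) is hypothetical and not taken from the scheduler code. A real implementation would read these values from the scheduler's own queue and application objects and write them through its logger.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class SchedulerStateDump {

    /** Hypothetical point-in-time snapshot of one app or queue. */
    public static class Entry {
        final String name;               // app id or queue name
        final long fairShareMb;          // instantaneous fair share
        final long steadyFairShareMb;    // steady fair share
        final long demandMb;             // current demand
        final long amResourceMb;         // resources used by the AM
        final boolean runnable;          // run state
        final long lastTimeAtFairShare;  // timestamp in ms
        final long lastTimeAtMinShare;   // timestamp in ms

        public Entry(String name, long fairShareMb, long steadyFairShareMb,
                     long demandMb, long amResourceMb, boolean runnable,
                     long lastTimeAtFairShare, long lastTimeAtMinShare) {
            this.name = name;
            this.fairShareMb = fairShareMb;
            this.steadyFairShareMb = steadyFairShareMb;
            this.demandMb = demandMb;
            this.amResourceMb = amResourceMb;
            this.runnable = runnable;
            this.lastTimeAtFairShare = lastTimeAtFairShare;
            this.lastTimeAtMinShare = lastTimeAtMinShare;
        }
    }

    /** Formats one entry as a single key=value log line. */
    public static String formatEntry(Entry e) {
        return String.format(Locale.ROOT,
            "name=%s fairShare=%dMB steadyFairShare=%dMB demand=%dMB "
                + "amResource=%dMB runnable=%b "
                + "lastTimeAtFairShare=%d lastTimeAtMinShare=%d",
            e.name, e.fairShareMb, e.steadyFairShareMb, e.demandMb,
            e.amResourceMb, e.runnable,
            e.lastTimeAtFairShare, e.lastTimeAtMinShare);
    }

    /** Builds the full dump, one line per entry. */
    public static String dump(List<Entry> entries) {
        StringBuilder sb = new StringBuilder();
        for (Entry e : entries) {
            sb.append(formatEntry(e)).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Entry> entries = new ArrayList<>();
        // Illustrative values only, not real scheduler output.
        entries.add(new Entry("root.a", 2048, 1024, 4096, 512, true, 1000L, 900L));
        entries.add(new Entry("root.b", 1024, 1024, 0, 0, false, 800L, 800L));
        System.out.print(dump(entries));
    }
}
```

Keeping each app or queue on a single key=value line is only one possible layout. It is used here because such lines are easy to grep and compare across two dumps taken at different times.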