Currently, CapacityScheduler has some issues in making sure that a ParentQueue always obeys its capacity limits, for example:
1) When allocating a container under a parent queue, it only checks parentQueue.usage < parentQueue.max. If a leaf queue allocates a container with container.size > (parentQueue.max - parentQueue.usage), the parent queue can exceed its maximum resource limit, as in the following example:
Queue-A2 is able to allocate a container since its usage < max, but if we do that, A's usage can exceed A.max.
2) When doing the continuous reservation check, the parent queue only tells its children "you need to unreserve some resource, so that I will be below my maximum resource", but it does not tell them how much resource needs to be unreserved. This may also lead to the parent queue exceeding its configured maximum capacity.
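To make issue 1) concrete, here is a minimal sketch of the gap between the usage-only check and a check that also accounts for the container size. It is not the actual CapacityScheduler code: the class name, method names, and the numbers are hypothetical, and resources are reduced to a single long value (say, GB of memory) instead of a Resource object.

```java
public class ParentLimitGap {
    // Usage-only check, as described in 1): compares current usage with max
    // and ignores the size of the container about to be allocated.
    static boolean canAssignUsageOnly(long used, long max) {
        return used < max;
    }

    // Size-aware check (illustrative): the container must fit in the
    // remaining room (max - used) of the parent queue.
    static boolean canAssignWithSize(long used, long max, long containerSize) {
        return containerSize <= max - used;
    }

    public static void main(String[] args) {
        // Hypothetical numbers: parent max = 10, parent used = 8, container = 4.
        long parentMax = 10, parentUsed = 8, containerSize = 4;

        // The usage-only check passes, so the allocation would go ahead
        // and push the parent to 12 > 10.
        System.out.println(canAssignUsageOnly(parentUsed, parentMax));               // true
        // The size-aware check rejects the same allocation.
        System.out.println(canAssignWithSize(parentUsed, parentMax, containerSize)); // false
    }
}
```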
- ParentQueue will set its children's ResourceUsage.headroom, which is the maximum resource its children can allocate.
- ParentQueue will set its children's headroom to (say the parent's name is "qA"): min(qA.headroom, qA.max - qA.used). This makes sure that the capacity limits of qA's ancestors are enforced as well (qA.headroom is set by qA's parent).
- needToUnReserve is no longer necessary; instead, children can determine how much resource needs to be unreserved to stay within their parent's resource limit.
- Moreover, with this, YARN-3026 will make a clear boundary between LeafQueue and FiCaSchedulerApp; headroom will consider user-limit, etc.
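The headroom rule in the list above could be sketched roughly as follows. This is only an illustration under simplifying assumptions, not the patch itself: resources are a single long value rather than a Resource object, the class and method names are hypothetical, clamping at zero is an added assumption, and the "amount to unreserve" formula (container size minus headroom) is one plausible reading of the third bullet.

```java
public class HeadroomSketch {
    // Headroom a parent queue hands to its children:
    // min(parent.headroom, parent.max - parent.used), clamped at zero
    // (the clamp is an assumption of this sketch).
    static long childHeadroom(long parentHeadroom, long parentMax, long parentUsed) {
        return Math.max(0L, Math.min(parentHeadroom, parentMax - parentUsed));
    }

    // How much a child would have to unreserve so that a container of the
    // given size fits into the headroom it was given (assumed formula).
    static long amountToUnreserve(long headroom, long containerSize) {
        return Math.max(0L, containerSize - headroom);
    }

    public static void main(String[] args) {
        // Hypothetical hierarchy: root (max 100, used 90) -> qA (max 40, used 20).
        long rootHeadroom = 100;                               // root is bounded only by its own max
        long qAHeadroom = childHeadroom(rootHeadroom, 100, 90); // min(100, 10) = 10
        long leafHeadroom = childHeadroom(qAHeadroom, 40, 20);  // min(10, 20) = 10

        // qA has 20 free on its own, but root's limit caps its children at 10.
        System.out.println(leafHeadroom);                        // 10
        // A container of size 12 needs 2 units unreserved first.
        System.out.println(amountToUnreserve(leafHeadroom, 12)); // 2
    }
}
```

Because each level takes the minimum with the headroom it received from above, a limit anywhere on the path to the root carries down to the leaves without each queue having to walk its ancestors.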