We found this problem when the cluster was almost, but not fully, exhausted (93% used): the scheduler kept allocating for one app but always failed to commit the allocation. This can block requests from other apps, leaving part of the cluster's resources unusable.
To reproduce this problem:
(1) use DominantResourceCalculator;
(2) the cluster resource has an empty resource type, for example gpu=0;
(3) the scheduler allocates a container for app1, which has reserved containers and whose queue limit or user limit is reached (used + required > limit).
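For step (1), the resource calculator is selected in capacity-scheduler.xml. The snippet below is the standard Capacity Scheduler property for this (cluster-specific settings such as queues are not shown):

```xml
<!-- capacity-scheduler.xml: switch from the default DefaultResourceCalculator
     to DominantResourceCalculator so that all resource types
     (memory, vcores, gpu, ...) participate in resource comparisons -->
<property>
  <name>yarn.scheduler.capacity.resource-calculator</name>
  <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>
```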
Reference code in RegularContainerAllocator#assignContainer:
For example, resourceNeedToUnReserve can be <8GB, -6 vcores, 0 gpu> when headRoom=<0GB, 8 vcores, 0 gpu> and capacity=<8GB, 2 vcores, 0 gpu>; needToUnreserve, which is the result of Resources#greaterThan, will then be false. This is unreasonable: the required resource does exceed the headroom (8GB of memory against 0GB), so unreserving is needed.
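The false result above can be illustrated with a simplified, self-contained model of a dominant-share comparison. This is a hedged sketch, not the actual Hadoop source (class and method names here are invented for illustration); it assumes the share of each resource type is computed by plain division against the cluster total, in which case a zero-capacity type (gpu=0) produces 0.0/0.0 = NaN, the NaN poisons the dominant share, and any comparison against NaN returns false:

```java
// Simplified model of a dominant-share comparison with a zero-capacity
// resource type in the cluster. NOT the actual Hadoop implementation.
public class NeedToUnreserveDemo {

  // Dominant share of a resource vector against the cluster total,
  // computed by naive per-type division. When cluster[i] == 0 and
  // res[i] == 0, the division yields NaN, and Math.max propagates it.
  static double dominantShare(double[] res, double[] cluster) {
    double max = Double.NEGATIVE_INFINITY;
    for (int i = 0; i < res.length; i++) {
      max = Math.max(max, res[i] / cluster[i]); // 0.0/0.0 == NaN
    }
    return max;
  }

  // greaterThan(a, b) modeled as: dominantShare(a) > dominantShare(b).
  // Any comparison involving NaN evaluates to false in Java.
  static boolean greaterThan(double[] a, double[] b, double[] cluster) {
    return dominantShare(a, cluster) > dominantShare(b, cluster);
  }

  public static void main(String[] args) {
    double[] cluster = {100 * 1024, 100, 0}; // 100GB (MB), 100 vcores, 0 gpu
    // resourceNeedToUnReserve = capacity - headroom = <8GB, -6 vcores, 0 gpu>
    double[] needToUnReserve = {8 * 1024, -6, 0};
    double[] none = {0, 0, 0};

    // Memory clearly exceeds headroom, yet the comparison yields false,
    // because both dominant shares are NaN and NaN > NaN is false.
    System.out.println(greaterThan(needToUnReserve, none, cluster)); // prints false
  }
}
```

This only demonstrates one plausible mechanism for the miscomputation; the exact arithmetic inside DominantResourceCalculator may differ, but the observed outcome (greaterThan returning false for a vector with a positive memory component) is the same.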
After that, when execution reaches the unreserve step in RegularContainerAllocator#assignContainer, that step is skipped because shouldAllocOrReserveNewContainer is true (required containers > reserved containers) while needToUnreserve was wrongly computed as false. As a result, the scheduler keeps proposing the same allocation without ever releasing a reservation, and every commit fails.
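The skip condition described above can be sketched as follows. The helper name willUnreserve is hypothetical and only models the decision as the report describes it, not the actual Hadoop code: the unreserve path runs only when a new container cannot be allocated or reserved, or when needToUnreserve is true, so a wrongly-false needToUnreserve keeps the reservation alive indefinitely:

```java
// Hypothetical model of the skip condition described in the report;
// NOT the actual Hadoop source.
public class UnreserveSkipDemo {

  // The unreserve step runs only when we cannot allocate/reserve a new
  // container, or when the required resource exceeds the headroom.
  static boolean willUnreserve(boolean shouldAllocOrReserveNewContainer,
                               boolean needToUnreserve) {
    return !shouldAllocOrReserveNewContainer || needToUnreserve;
  }

  public static void main(String[] args) {
    // Buggy scenario: a new container is wanted (required > reserved) and
    // needToUnreserve was miscomputed as false, so nothing is unreserved
    // and the allocation proposal repeatedly fails to commit.
    System.out.println(willUnreserve(true, false)); // prints false

    // With needToUnreserve correctly computed as true, the reservation
    // would be released and the allocation could proceed.
    System.out.println(willUnreserve(true, true)); // prints true
  }
}
```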