Thanks for your comments, Vinod Kumar Vavilapalli/Jian He:
- Main code comments from Vinod:
checkNodeLabelExpression: NPEs on labelExpression can happen?
No, they cannot; I removed those checks.
FiCaSchedulerNode: exclusive, setters, getters -> exclusivePartition
They're not used anywhere, so I removed them.
1. Changed to nodePartitionToLookAt: Done
2. Now all queues check needResources
3. Renamed to hasPendingResourceRequest, as suggested by Jian; see the sketch after this list
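For reference, a minimal sketch of what the per-partition pending check could look like (field and helper names here are illustrative, not necessarily what the patch uses):

{code:java}
// Sketch only: pending resources are tracked per partition, so the check
// consults the bucket for the partition we are looking at.
public boolean hasPendingResourceRequest(ResourceCalculator rc,
    Resource cluster, String nodePartition, SchedulingMode schedulingMode) {
  // In ignore-exclusivity mode, only default-partition asks are eligible.
  if (schedulingMode == SchedulingMode.IGNORE_PARTITION_EXCLUSIVITY) {
    nodePartition = RMNodeLabelsManager.NO_LABEL;
  }
  Resource pending = queueUsage.getPending(nodePartition);
  return Resources.greaterThan(rc, cluster, pending, Resources.none());
}
{code}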
checkResourceRequestMatchingNodeLabel can be moved into the application?
Moved to SchedulerUtils
checkResourceRequestMatchingNodeLabel nodeLabelToLookAt arg is not used anywhere else.
Done (merged it into SchedulerUtils.checkResourceRequestMatchingNodePartition)
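Roughly, the merged check behaves like the sketch below (the exact signature in the patch may differ; treat this as a sketch of the semantics only):

{code:java}
// Sketch of SchedulerUtils.checkResourceRequestMatchingNodePartition.
public static boolean checkResourceRequestMatchingNodePartition(
    String requestedPartition, String nodePartition,
    SchedulingMode schedulingMode) {
  // Normalize a null label expression to the default partition.
  if (requestedPartition == null) {
    requestedPartition = RMNodeLabelsManager.NO_LABEL;
  }
  if (schedulingMode == SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY) {
    // Exclusive allocation: the ask must name exactly this node's partition.
    return requestedPartition.equals(nodePartition);
  }
  // Non-exclusive allocation: only default-partition requests may borrow
  // idle resources from other partitions.
  return requestedPartition.equals(RMNodeLabelsManager.NO_LABEL);
}
{code}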
Renamed to reset/addMissedNonPartitionedRequestSchedulingOpportunity
It seems like we are not putting absolute max-capacities on the individual queues when not-respecting-partitions. Describe why? Similarly, describe why user-limit-factor is ignored in the not-respecting-partitions mode.
- Test code comments from Vinod:
Rename to testPreferenceOfNeedyPrioritiesUnderSameAppTowardsNodePartitions
Actually, now that I rename it that way, this may not be the right behavior. Not respecting priorities within an app can result in scheduling deadlocks:
This will not lead to a deadlock, because we count resource usage separately under each partition: priority=1 goes first on partition=y before priority=0 is fully satisfied only because priority=1 is the lowest priority that asks for partition=y.
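To make that concrete, a toy illustration of the per-partition accounting (numbers and names are made up):

{code:java}
// Each partition has its own used/pending buckets, so allocating for
// priority=1 on partition "y" never consumes the headroom that
// priority=0's default-partition asks are measured against.
ResourceUsage queueUsage = new ResourceUsage();
queueUsage.incUsed("y", Resource.newInstance(2048, 2)); // priority=1 on y
queueUsage.incUsed("",  Resource.newInstance(4096, 4)); // priority=0 on default
// Limit checks for the default partition read queueUsage.getUsed("") only,
// so neither priority can starve the other across partitions.
{code}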
Renamed to testPreferenceOfQueuesTowardsNodePartitions
Let's move all these node-label related tests into their own test-case.
Moved to TestNodeLabelContainerAllocation
Add more tests:
1. Added testAMContainerAllocationWillAlwaysBeExclusive to make sure the AM container is always allocated exclusively (a rough skeleton follows this list).
2. Added testQueueMaxCapacitiesWillNotBeHonoredWhenNotRespectingExclusivity to make sure max-capacities on individual queues are ignored when doing ignore-exclusivity allocation.
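A rough skeleton of the first test (the setup helper is hypothetical; the real test in TestNodeLabelContainerAllocation does full cluster setup):

{code:java}
@Test
public void testAMContainerAllocationWillAlwaysBeExclusive() throws Exception {
  // Hypothetical helper: bring up an RM where only partition=x nodes
  // have free resources.
  MockRM rm = startRMWithOnlyLabeledNodesFree("x");
  RMApp app = rm.submitApp(1024); // AM request has no label expression

  // Even though ignore-exclusivity allocation could use the idle
  // partition=x node, the AM container must stay on the default
  // partition, so the attempt stays unallocated.
  Assert.assertNull(app.getCurrentAppAttempt().getMasterContainer());
}
{code}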
Merge queue#needResource and application#needResource
Some methods, like canAssignToThisQueue, are passed both nodeLabels and exclusiveType; they may be simplified by passing the current partitionToAllocate to simplify the internal if/else check.
Actually, it will not simplify the logic much; I checked, and there are only a few places that could leverage nodePartitionToLookAt. I prefer to keep the semantics of SchedulingMode (the two modes are sketched below).
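For reference, the two modes the patch keeps (comments here are mine):

{code:java}
public enum SchedulingMode {
  // Only allocate on a node when the request's label expression matches
  // the node's partition exactly.
  RESPECT_PARTITION_EXCLUSIVITY,
  // Additionally allow default-partition requests to use idle resources
  // of non-exclusive partitions.
  IGNORE_PARTITION_EXCLUSIVITY
}
{code}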
The following may be incorrect, as the current request may not be the AM container request, even though null == rmAppAttempt.getMasterContainer()
I understand masterContainer may be initialized asynchronously in RMApp, but that window is negligible, and doing the null check here makes sure the AM container doesn't get allocated through the non-exclusive path.
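The guard under discussion, sketched (surrounding code elided; the early return is illustrative):

{code:java}
RMAppAttempt rmAppAttempt = rmContext.getRMApps()
    .get(application.getApplicationId()).getCurrentAppAttempt();
if (null == rmAppAttempt.getMasterContainer()) {
  // Conservative: until masterContainer is set on the attempt, treat any
  // allocation for it as a potential AM container and keep it exclusive,
  // i.e. skip the ignore-exclusivity path.
  return false;
}
{code}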
The below if/else can be avoided by passing the nodePartition into queueCapacities.getAbsoluteCapacity(nodePartition).
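What the suggestion amounts to, sketched (local names illustrative):

{code:java}
// Before: branch on the default partition explicitly.
// float absCap = nodePartition.equals(RMNodeLabelsManager.NO_LABEL)
//     ? queueCapacities.getAbsoluteCapacity()
//     : queueCapacities.getAbsoluteCapacity(nodePartition);

// After: QueueCapacities already keys capacities by partition.
float absCap = queueCapacities.getAbsoluteCapacity(nodePartition);
{code}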
the second limit won’t be hit?
Yeah, it will not be hit, but naming it "maxUserLimit" enhances readability.
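Sketch of that readability point (structure only; local names illustrative). In ignore-exclusivity mode maxUserLimit is the whole partition resource, so it never actually caps the result, but the min(...) stays self-explanatory:

{code:java}
Resource maxUserLimit;
if (schedulingMode == SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY) {
  maxUserLimit =
      Resources.multiplyAndRoundDown(queueCapacity, userLimitFactor);
} else {
  // Ignore-exclusivity: the second limit can never be hit.
  maxUserLimit = partitionResource;
}
Resource userLimit =
    Resources.min(resourceCalculator, partitionResource, limit, maxUserLimit);
{code}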
nonExclusiveSchedulingOpportunities#setCount -> add(Priority)
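For context, the suggested change on the Guava Multiset, sketched:

{code:java}
Multiset<Priority> missedNonPartitionedRequestSchedulingOpportunity =
    HashMultiset.create();
Priority priority = Priority.newInstance(1);

// Before: read-modify-write via setCount.
// missedNonPartitionedRequestSchedulingOpportunity.setCount(priority,
//     missedNonPartitionedRequestSchedulingOpportunity.count(priority) + 1);

// After: a single add(Priority) call with the same effect.
missedNonPartitionedRequestSchedulingOpportunity.add(priority);
{code}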
Attached new patch (ver.3)