Hi Sandy Ryza,
Thanks for your comments, actually I have read QueuePlacementPolicy/QueuePlacementRule from FS before working on this patch. The generic design fo this patch is based on FS's queue placement policy structure, but also with some changes.
To your comments:
Is a common way of configuration proposed?
No common configuration, it only defined a set of common interfaces. Since FS/CS have very different ways to configuration, now rules are created by different schedulers, see CapacityScheduler#updatePlacementRules as an example.
What steps are required for the Fair Scheduler to integrate with this?
1) Port existing rules to new APIs defined in the patch, this should be simple
2) Change configuration implementation to instance new defined PlacementRule, you may not need to change existing configuration items itself.
3) Change FS workflow, with this patch, queue mapping is happened before submit to scheduler. Remove queue mapping related logics from FS and create queue if needed.
Each placement rule gets the chance to assign the app to a queue, reject the app, or pass. If it passes, the next rule gets a chance.
New APIs are very similar:
Non-null is determined
Null is not determined
Throw exception when rejected.
You can take a look at org.apache.hadoop.yarn.server.resourcemanager.placement.PlacementRule
A placement rule can base its decision on:
Yes you can do all of them with the new API except "The set of queues given in the Fair Scheduler configuration.":
I was thinking necessarity of passing set of queues in the interface. In existing implementations of QueuePlacementPolicy, FS queues are only used to check mapped queue's existence. I would prefer to delay the check to submit to scheduler. See my next comment about "create" flag for more details.
Another reason of not passing queue names set via interface is, queues are very dynamic. For example, if user wants to submit application to queue with lowest utilization, queue names set may not enough. I would prefer to let rule choose to get what need from scheduler.
Rules are marked as "terminal" if they will never pass. This helps to avoid misconfigurations where users place rules after terminal rules.
I'm not sure if is it necessary. I think terminal or not should be determined by runtime, but I'm OK if you think it's must to have.
Rules have a "create" attribute which determines whether they can create a new queue or whether they must assign to existing queues.
I think queue is create-able or not should be determined by scheduler, it should be a part of scheduler configuration instead of rule itself.
You can put "create" to your implemented rules without any issue, but I prefer not to expose it to public interface.
Currently the set of placement rules is limited to what's implemented in YARN. I.e. there's no public pluggable rule support.
Agree, this is one thing we need to do in the future. For now, we can make queue mapping happens in a central place first.
Are there places where my summary would not describe what's going on in this patch?
I think it should covers most of my patch, you can also take a look at my patch to see if anything unexpected .