I'm excited to see this started – I'm quite interested in the multi-resource scheduling problem. After reading through the patch, I have a few questions for you; hopefully this feedback will be helpful.
First off, I want to confirm my understanding is correct: this patch is designed to allocate resources to jobs within the same capacity queue based on a DRF-inspired ordering of their resource demands. It is not designed to do weighted DRF across the complete cluster. If I'm mistaken, some of my feedback may not apply.
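To make sure we're talking about the same thing, here's my mental model of the ordering in a quick Python sketch (names and numbers are mine, not from the patch): each job's dominant share is the largest fraction it consumes of any single resource, and the scheduler offers the next container to the job with the smallest dominant share.

```python
def dominant_share(usage, capacity):
    """Dominant share = the max fraction of any one resource the job uses."""
    return max(usage[r] / capacity[r] for r in capacity)

cluster = {"mem": 100, "cpu": 40}
jobs = {
    "A": {"mem": 20, "cpu": 4},   # dominant share 0.20 (memory-bound)
    "B": {"mem": 10, "cpu": 12},  # dominant share 0.30 (cpu-bound)
}

# DRF ordering: the job with the smallest dominant share goes first.
order = sorted(jobs, key=lambda j: dominant_share(jobs[j], cluster))
print(order)  # ['A', 'B']
```

If the patch is doing something different from this (e.g., ordering by total demand rather than dominant share), please correct me.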
1) Are you planning to change the definition of a queue's capacity? Currently, it is defined as a percentage of the parent queue's total memory. Alternatively, queues could be specified with a percentage of each resource, e.g., one queue with "75% CPU and 50% RAM" and a second with "25% CPU and 50% RAM".
2) Do you plan to change how spare capacity is allocated? My understanding is that it's currently shared proportionally, based on the queue capacities, an approach that seems intuitive for cluster operators. With a multi-resource setup, however, running DRF on the pool of spare resources would provide higher utilization. (I can provide an example of this if you'd like.)
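As a quick sketch of the kind of case I mean (toy numbers, not from any real cluster): with a 100-memory/100-cpu spare pool, one queue running cpu-heavy tasks, and one running memory-heavy tasks, a 50/50 proportional split caps each queue at 5 tasks (55% utilization of both resources), while a greedy DRF allocation over the shared pool gets each queue to 9 tasks (99% utilization).

```python
def drf_allocate(pool, demands):
    """Greedily hand the next task to the queue with the lowest dominant share."""
    used = {q: {r: 0 for r in pool} for q in demands}
    counts = {q: 0 for q in demands}
    while True:
        remaining = {r: pool[r] - sum(used[q][r] for q in demands) for r in pool}
        # queues whose next task still fits in the remaining pool
        fits = [q for q in demands
                if all(demands[q][r] <= remaining[r] for r in pool)]
        if not fits:
            return counts
        q = min(fits, key=lambda q: max(used[q][r] / pool[r] for r in pool))
        for r in pool:
            used[q][r] += demands[q][r]
        counts[q] += 1

pool = {"mem": 100, "cpu": 100}
demands = {"q1": {"mem": 1, "cpu": 10},   # cpu-heavy tasks
           "q2": {"mem": 10, "cpu": 1}}   # memory-heavy tasks
print(drf_allocate(pool, demands))  # {'q1': 9, 'q2': 9}
```

Under a static 50/50 split, q1 could only run min(50/1, 50/10) = 5 tasks and q2 likewise, leaving almost half of each resource idle.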
3) Are you planning to support priorities or weights within the queues? IIRC, this was supported in the MR1 scheduler, and the DRF paper describes a weighted extension.
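For reference, the weighted extension I'm thinking of is the straightforward one from the DRF paper: divide each job's dominant share by its weight before comparing. A minimal sketch (again, illustrative names and numbers only):

```python
def weighted_dominant_share(usage, capacity, weight):
    """Dominant share scaled down by the job/queue weight, per the DRF paper's
    weighted extension: higher weight => entitled to a larger raw share."""
    return max(usage[r] / capacity[r] for r in capacity) / weight

capacity = {"mem": 100, "cpu": 40}

# With weight 2, job "A" can hold twice the raw dominant share of job "B"
# before the scheduler starts preferring "B".
a = weighted_dominant_share({"mem": 40, "cpu": 4}, capacity, weight=2)  # 0.4 / 2
b = weighted_dominant_share({"mem": 20, "cpu": 4}, capacity, weight=1)  # 0.2
print(a == b)  # True: the two jobs are considered equally served
```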
4) Lastly, with the increasing flexibility of the YARN scheduler, I think it makes sense to better support heterogeneous clusters. Currently, yarn.nodemanager.resource.memory-mb is a constant across the cluster, but with a scheduler capable of packing differently shaped resource containers onto each node, heterogeneous nodes would be a natural extension. (This is more of an observation than a question.)
Looking forward to further discussions.