Cancelling the patch because it can trigger a ConcurrentModificationException. The scheduler keys are not in a concurrent map, so removing an entry by any means other than the active iterator while the keys are being iterated will fail. When the scheduler loop iterates the keys, the container allocation could remove a scheduler key and cause the CME. Either the scheduler loop needs to be the one that removes the key, we need to iterate over a copy (undesirable), or the key collection needs to support concurrent modification.
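To illustrate the failure mode and two of the options above, here is a minimal standalone sketch. It is not the scheduler code: the class name, the method names, and the priority-to-pending-count map are hypothetical stand-ins for the scheduler key collection, and the choice of ConcurrentSkipListMap is just one example of a collection that tolerates modification during iteration.

```java
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentSkipListMap;

public class SchedulerKeyIteration {

    // Hypothetical stand-in for the scheduler keys: priority -> pending containers.
    static Map<Integer, Integer> sampleKeys(Map<Integer, Integer> keys) {
        keys.put(1, 0);
        keys.put(2, 3);
        keys.put(3, 0);
        return keys;
    }

    // Problem case: the map is modified directly while a key iterator is active.
    // Returns true if the iteration failed with a CME.
    static boolean removeOutsideIteratorThrows() {
        Map<Integer, Integer> keys = sampleKeys(new TreeMap<>());
        try {
            for (Integer key : keys.keySet()) {
                if (keys.get(key) == 0) {
                    keys.remove(key); // structural change the iterator does not know about
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Option 1: the iterating loop is the one that removes, via its own iterator.
    static Map<Integer, Integer> removeViaIterator() {
        Map<Integer, Integer> keys = sampleKeys(new TreeMap<>());
        Iterator<Map.Entry<Integer, Integer>> it = keys.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue() == 0) {
                it.remove();
            }
        }
        return keys;
    }

    // Option 2: a collection whose iterators are weakly consistent, so direct
    // removal during iteration does not throw.
    static Map<Integer, Integer> removeFromConcurrentMap() {
        Map<Integer, Integer> keys = sampleKeys(new ConcurrentSkipListMap<>());
        for (Integer key : keys.keySet()) {
            if (keys.get(key) == 0) {
                keys.remove(key);
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        System.out.println("direct removal threw CME: " + removeOutsideIteratorThrows());
        System.out.println("iterator removal left: " + removeViaIterator());
        System.out.println("concurrent map left: " + removeFromConcurrentMap());
    }
}
```

Note that the failure does not require a second thread: a removal made from inside the loop body, as the allocation path would do, is enough.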
I guess the TODO should be moved
Good point. I'll fix that in the next version of the patch.
It is, in my opinion, leveraging what is an undocumented API (the fact that queue demand is updated only with the ANY request).
The only documentation of the YARN allocation protocol for a while was the MapReduce AM code, and that code leveraged this fact well before MAPREDUCE-5583. Asking for a single container on either rack1/host1, rack2/host2, or rack3/host3 doesn't allocate three containers; it allocates only one because the ANY request is 1. Also, looking at the core of the RM schedulers, whether or not applications get resources has always been driven by the ANY request. Lots of people based their early custom apps on the MapReduce AM code or on reading the YARN scheduler code, so I don't think we can change that behavior without risking breaking those apps. It's also interesting that one of the designers of the YARN allocation protocol suggested the ANY "hack" as the way forward on MAPREDUCE-5583. (See this comment.)
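The three-host example can be modeled with a small sketch. This is a deliberately simplified, hypothetical model (the class AnyRequestModel and its methods are invented for illustration and only handle node-local placement); it is not the RM scheduler code, and it assumes only that the ANY resource name is "*" and that the ANY count gates and is decremented by every allocation.

```java
import java.util.HashMap;
import java.util.Map;

public class AnyRequestModel {
    // Resource name used for the ANY (off-switch) request.
    static final String ANY = "*";

    // Resource name (host, rack, or ANY) -> outstanding container count.
    private final Map<String, Integer> asks = new HashMap<>();

    void ask(String resourceName, int numContainers) {
        asks.put(resourceName, numContainers);
    }

    int outstanding(String resourceName) {
        return asks.getOrDefault(resourceName, 0);
    }

    private void decrement(String resourceName) {
        asks.computeIfPresent(resourceName, (name, count) -> Math.max(0, count - 1));
    }

    // Try to place one node-local container on the offered host.
    // The ANY count is checked first, so it caps the total across all hosts.
    boolean tryAllocate(String host, String rack) {
        if (outstanding(ANY) <= 0) {
            return false; // no overall demand left, regardless of host/rack asks
        }
        if (outstanding(host) <= 0) {
            return false; // nothing requested on this host
        }
        decrement(host);
        decrement(rack);
        decrement(ANY);
        return true;
    }

    // One container wanted on any of three hosts: each host and rack asks for 1,
    // and ANY is 1. Returns how many containers the model hands out.
    static int allocateAcrossThreeHosts() {
        AnyRequestModel app = new AnyRequestModel();
        for (int i = 1; i <= 3; i++) {
            app.ask("host" + i, 1);
            app.ask("rack" + i, 1);
        }
        app.ask(ANY, 1);

        int allocated = 0;
        for (int i = 1; i <= 3; i++) {
            if (app.tryAllocate("host" + i, "rack" + i)) {
                allocated++;
            }
        }
        return allocated;
    }

    public static void main(String[] args) {
        System.out.println("containers allocated: " + allocateAcrossThreeHosts());
    }
}
```

In this model the first matching host wins and the remaining host and rack asks go unserved because the ANY count has dropped to zero, which is the behavior existing apps rely on.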
One way to do this might be to leverage the YARN Reservation System
Interesting idea, but that would limit the resources for the entire app, not just the requested phases (i.e., users often want to limit maps but not reduces, or vice versa).
Looks like the
YARN-1651 does the opposite as well...
YARN-1651 is about updating existing allocations for specific existing containers rather than new allocations. It doesn't have the concept of rack/ANY like the allocate protocol. Or am I missing something here?