YARN-6050 fixed the YARN API to allow multiple ResourceRequests when submitting an AM, so rack or node locality is now actually possible. We should let MapReduce users take advantage of this by exposing the functionality in some way. The raw YARN API allows a lot of flexibility (e.g. different resources per request), but we don't necessarily want to expose all of it: users could easily shoot themselves in the foot, and it would make this overly complicated.
I propose we allow users to specify racks and nodes for strict locality. This would allow users to restrict an MR AM to specific racks and/or nodes. We could add a new property, mapreduce.job.am.resource-request.strict.locality, which takes a comma-separated list of entries like:
- /<rack>
- /<rack>/<node>
- <node> (assumes /default-rack)
MapReduce would then use this information to create the corresponding ResourceRequests.
For example, mapreduce.job.am.resource-request.strict.locality=/rack1/node1 would create the following ResourceRequests:
- resourceName=ANY, relaxLocality=false, capability=<X,Y>
- resourceName=/rack1, relaxLocality=false, capability=<X,Y>
- resourceName=node1, relaxLocality=true, capability=<X,Y>
By default, the property would be unset, and you'd get the normal ANY ResourceRequest.
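To make the mapping concrete, here is a rough, self-contained sketch of how the property value could be turned into requests. It is not the actual implementation: StrictLocalitySketch, Request and buildRequests are hypothetical names, Request is a stand-in for YARN's ResourceRequest that omits capability, and the handling of a rack-only entry (relaxLocality=true on the rack so any node in it qualifies) and of the unset case (ANY with relaxLocality=true) are assumptions that this proposal does not spell out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class StrictLocalitySketch {

    /** Hypothetical stand-in for YARN's ResourceRequest (capability omitted). */
    public static final class Request {
        public final String resourceName;
        public final boolean relaxLocality;

        public Request(String resourceName, boolean relaxLocality) {
            this.resourceName = resourceName;
            this.relaxLocality = relaxLocality;
        }

        @Override
        public String toString() {
            return "resourceName=" + resourceName + ", relaxLocality=" + relaxLocality;
        }
    }

    /** Builds the request list for a property value; no validation of entries is done. */
    public static List<Request> buildRequests(String propertyValue) {
        List<Request> requests = new ArrayList<>();
        if (propertyValue == null || propertyValue.trim().isEmpty()) {
            // Property unset: assumed to be a single relaxed ANY request.
            requests.add(new Request("ANY", true));
            return requests;
        }

        // Group the listed nodes by rack, preserving the order given by the user.
        Map<String, Set<String>> nodesByRack = new LinkedHashMap<>();
        for (String raw : propertyValue.split(",")) {
            String entry = raw.trim();
            if (entry.isEmpty()) {
                continue;
            }
            String rack;
            String node = null;
            if (entry.startsWith("/")) {
                int slash = entry.indexOf('/', 1);
                if (slash < 0) {
                    rack = entry;                       // "/rack"
                } else {
                    rack = entry.substring(0, slash);   // "/rack/node"
                    node = entry.substring(slash + 1);
                }
            } else {
                rack = "/default-rack";                 // bare "node"
                node = entry;
            }
            Set<String> nodes = nodesByRack.computeIfAbsent(rack, k -> new LinkedHashSet<>());
            if (node != null) {
                nodes.add(node);
            }
        }

        // Strict locality: ANY must not be relaxed.
        requests.add(new Request("ANY", false));
        for (Map.Entry<String, Set<String>> e : nodesByRack.entrySet()) {
            // Assumption: a rack with no listed nodes is relaxed (any node in it is fine);
            // a rack with listed nodes is not, so only those nodes qualify.
            boolean rackOnly = e.getValue().isEmpty();
            requests.add(new Request(e.getKey(), rackOnly));
            for (String node : e.getValue()) {
                requests.add(new Request(node, true));
            }
        }
        return requests;
    }

    public static void main(String[] args) {
        for (Request r : buildRequests("/rack1/node1")) {
            System.out.println(r);
        }
    }
}
```

Running main with /rack1/node1 prints the same three requests as the example above, minus the capability. An entry that names a rack both on its own and with a node is treated here as node-restricted; whether that is the right behavior is an open design question.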