In the design doc of FLINK-14106, there was a discussion on how we map allocated containers with the requested ones on YARN. We rejected the design option that uses container priorities for mapping containers of different resources, because we do not want to priorities different container requests (which is the original purpose for this field). As a result, we have to interpret how the requested container request would be normalized by Yarn, and map the allocated/requested containers accordingly, which is complicated and fragile. See also
Recently in our POC for fine grained resource management, we surprisingly discovered that Yarn actually doesn't work with container requests same priority and different resources. I do not find this described as an official protocol in any Yarn's documents. The issue has been raised in early Yarn versions (YARN-314) and has not been fixed util Hadoop 2.9 when allocationRequestId is introduced. In Hadoop 2.8, Yarn scheduler is still internally using priority as the key of a container request (see AppSchedulingInfo#updateResourceRequests ), thus requests same priority and different resources would overwrite each other.
The new discovery suggests that, if we want to support containers with different resources on Hadoop 2.8 and earlier versions, we have to give them different priorities anyway. Thus, I would suggest to get rid of the container normalization simulation and go back to the previously rejected priority based design option.