Currently, both the YARN ApplicationMaster and the task containers are launched with a JVM heap size equal to the total memory available to the container. However, while the heap is one of the major consumers of memory in the JVM, other components (such as thread stacks, class metadata, and native buffers) also occupy memory that is not included in the heap calculation: http://stackoverflow.com/questions/9725633/why-is-my-jvms-total-memory-usage-more-than-30-times-greater-than-its-xmx-value.
With the current code, if we launch containers with limited memory (e.g. 512 MB) and leave yarn.nodemanager.vmem-pmem-ratio at its default of 2.1, it is very easy to exceed the container's virtual memory limit, and applications fail even when they are as simple as the SimpleShortestPaths example. While we could force users to set a higher vmem-pmem ratio, I think a better option would be to add a configurable margin, i.e. a fraction of the container memory to use for the heap. In my own setups I have always set the heap to 75% of the total memory available to the container, and with that setting I have had no problems with excessive virtual memory.
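
As a rough sketch of what I have in mind (the class name HeapFraction and its methods are hypothetical and are not existing Giraph code; the 0.75 value is just my usual setting, not a tested default), the heap size could be derived from the container memory like this:

```java
/** Hypothetical helper: derives the JVM heap size from the container memory. */
final class HeapFraction {
  private HeapFraction() { }

  /**
   * Returns the heap size in MB for a container of containerMb megabytes,
   * using only the given fraction of the container memory for the heap.
   */
  static int heapMb(int containerMb, double fraction) {
    if (containerMb <= 0) {
      throw new IllegalArgumentException("containerMb must be positive");
    }
    if (fraction <= 0.0 || fraction > 1.0) {
      throw new IllegalArgumentException("fraction must be in (0, 1]");
    }
    // Truncate towards zero so the heap never exceeds the requested share.
    return (int) (containerMb * fraction);
  }

  /** Builds the -Xmx option to put on the container launch command line. */
  static String xmxOption(int containerMb, double fraction) {
    return String.format("-Xmx%dm", heapMb(containerMb, fraction));
  }

  public static void main(String[] args) {
    // A 512 MB container with a 75% heap fraction.
    System.out.println(xmxOption(512, 0.75)); // prints "-Xmx384m"
  }
}
```

With a 512 MB container this leaves 128 MB of the container's memory for everything the JVM allocates outside the heap. The fraction would be read from a new configuration option, so users who need a different margin can change it without touching the vmem-pmem ratio.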