The TaskTracker currently supports killing tasks if the virtual memory of a task exceeds a set of configured thresholds. I would like to extend this feature to enable killing tasks if the physical memory used by that task exceeds a certain threshold.
On a certain operating system (guess?), if user space processes start using lots of memory, the machine hangs and dies quickly. This means that we would like to prevent map-reduce jobs from triggering this condition. From my understanding, the killing-based-on-virtual-memory-limits (
HADOOP-5883) were designed to address this problem. This works well when most map-reduce jobs are Java jobs and have well-defined -Xmx parameters that specify the max virtual memory for each task. On the other hand, if each task forks off mappers/reducers written in other languages (python/php, etc), the total virtual memory usage of the process-subtree varies greatly. In these cases, it is better to use kill-tasks-using-physical-memory-limits.