Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
0.14.1
Description
In order to fix MESOS-662, we enabled the OOM killer.
In addition, instead of setting the memory hard limit (memory.limit_in_bytes) to the requested amount of memory, we set the soft limit (memory.soft_limit_in_bytes) to that amount and set the hard limit higher by a fixed amount. Reaching the soft limit triggers a memory threshold notification, at which point we capture the memory.stat information and treat the executor as having OOMed.
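The limit scheme described above can be sketched roughly as follows. This is only an illustration, not the Mesos implementation: the function names, the `cgroup_dir` parameter, and the 128 MiB headroom are assumptions (the description only says "a fixed amount"). The two file names are the cgroup memory controller files named above.

```python
import os

# Hypothetical headroom: the description only says the hard limit is
# raised "by a fixed amount", so this value is an assumption.
HARD_LIMIT_HEADROOM_BYTES = 128 * 1024 * 1024


def compute_limits(requested_bytes, headroom_bytes=HARD_LIMIT_HEADROOM_BYTES):
    """Return (soft_limit, hard_limit) in bytes for a requested amount."""
    if requested_bytes <= 0:
        raise ValueError("requested_bytes must be positive")
    # Soft limit is the requested memory; hard limit sits above it.
    return requested_bytes, requested_bytes + headroom_bytes


def write_limits(cgroup_dir, requested_bytes,
                 headroom_bytes=HARD_LIMIT_HEADROOM_BYTES):
    """Write both limits into the control files under cgroup_dir."""
    soft, hard = compute_limits(requested_bytes, headroom_bytes)
    with open(os.path.join(cgroup_dir, "memory.soft_limit_in_bytes"), "w") as f:
        f.write(str(soft))
    with open(os.path.join(cgroup_dir, "memory.limit_in_bytes"), "w") as f:
        f.write(str(hard))
    return soft, hard
```

For example, `write_limits("/sys/fs/cgroup/memory/<some cgroup>", 512 * 1024 * 1024)` would set the soft limit to the requested 512 MiB and the hard limit to that plus the assumed headroom. The cgroup path here is a placeholder.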
We've seen reports from users that this does not behave the same as simply setting the hard limit. In particular, we've seen that the kernel does not purge the file cache upon hitting the soft limit (the kernel documentation only states that action is taken on the soft limit in the presence of system-wide memory pressure). However, it was not clear from the email discussion or the review to what extent the hard limit and soft limit are treated differently in terms of purging cached memory: https://reviews.apache.org/r/14043/
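One way to observe the behavior in question is to look at how much of a cgroup's charged memory is file cache versus anonymous memory in the captured memory.stat. The sketch below assumes the `key value` per-line layout of memory.stat with `cache` and `rss` fields. The helper names are hypothetical, and the numbers in the usage note are made up for illustration.

```python
def parse_memory_stat(text):
    """Parse 'key value' lines from memory.stat into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        key, value = parts
        stats[key] = int(value)
    return stats


def cache_fraction(stats):
    """Return the share of (cache + rss) that is page cache."""
    cache = stats.get("cache", 0)
    rss = stats.get("rss", 0)
    total = cache + rss
    return cache / total if total else 0.0
```

With illustrative input such as `"cache 300\nrss 100\n"`, `cache_fraction(parse_memory_stat(...))` is 0.75, meaning most of the charge is page cache that the kernel could in principle reclaim.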