Details
- Improvement
- Status: Open
- Minor
- Resolution: Unresolved
- 2.1.0-beta
- None
- None
- OS/X with not enough file handles
Description
When creating too many containers with a claimed resource use of 0 RAM or vCores, the NM got into a state where exec() was continually failing, but nothing seemed to recognise this and blacklist the node.
Something should notice that all container launches for an app/container are failing and act on it. While AMs can and should code for this, NM failure is something that belongs at the YARN level.
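A minimal sketch of the kind of tracking proposed above, assuming a simple consecutive-failure threshold. The class LaunchFailureTracker and its methods are hypothetical and are not part of YARN; how the NM would report the unhealthy state to the RM is left out.

```java
/**
 * Hypothetical helper, not an existing YARN class: counts consecutive
 * container launch failures on one node and flags the node once a
 * threshold is reached.
 */
final class LaunchFailureTracker {
    private final int threshold;
    private int consecutiveFailures = 0;

    LaunchFailureTracker(int threshold) {
        if (threshold < 1) {
            throw new IllegalArgumentException("threshold must be >= 1");
        }
        this.threshold = threshold;
    }

    /** A successful launch shows exec() still works, so reset the count. */
    synchronized void recordSuccess() {
        consecutiveFailures = 0;
    }

    /** Call when a container launch fails, e.g. because exec() failed. */
    synchronized void recordFailure() {
        consecutiveFailures++;
    }

    /** True once the last `threshold` launches have all failed. */
    synchronized boolean isUnhealthy() {
        return consecutiveFailures >= threshold;
    }
}
```

Counting only consecutive failures means a single successful launch clears the flag, so an isolated bad container would not take the node out of service.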
Issue Links
- relates to YARN-2005 Blacklisting support for scheduling AMs (Resolved)