Details
- Type: Sub-task
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
Currently, the RM has no way of returning requests that cannot be met. For example, if an app requests a specific node and that node dies, the RM should return the request instead of holding on to it indefinitely.
Some situations in which this would be useful are:
- After YARN-392, requests are location specific, and the locations that were requested are no longer in the cluster.
- A high memory machine is lost, and resource requests above certain sizes are no longer able to be satisfied anywhere.
- All nodes in the cluster become unavailable.
In each of these situations, the RM has no way to inform the apps that it cannot allocate their requests.
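To make the three situations concrete, here is a minimal sketch of how a scheduler could identify requests that can never be met against the current set of live nodes. It is an illustration only, not a proposed patch: `PendingRequest`, `UnsatisfiableRequests`, and the `"*"` wildcard for "any location" are hypothetical names invented for this example and are not YARN classes. The check compares each request against a node's total capacity rather than its free capacity, on the assumption that a busy node is only a temporary obstacle while a missing or undersized node is a permanent one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical model type for illustration; this is NOT a YARN class.
final class PendingRequest {
    final String location; // a specific host name, or "*" for "anywhere"
    final int memoryMb;    // requested container size in MB

    PendingRequest(String location, int memoryMb) {
        this.location = location;
        this.memoryMb = memoryMb;
    }
}

// Hypothetical helper for illustration; this is NOT a YARN class.
final class UnsatisfiableRequests {
    static final String ANY = "*";

    // liveNodes maps host name -> total memory (MB) of each node still in the cluster.
    // Returns the requests that no live node could ever satisfy.
    static List<PendingRequest> find(List<PendingRequest> pending,
                                     Map<String, Integer> liveNodes) {
        List<PendingRequest> rejected = new ArrayList<>();
        for (PendingRequest request : pending) {
            if (!canEverBeSatisfied(request, liveNodes)) {
                rejected.add(request);
            }
        }
        return rejected;
    }

    static boolean canEverBeSatisfied(PendingRequest request,
                                      Map<String, Integer> liveNodes) {
        if (ANY.equals(request.location)) {
            // Location-free request: some live node must be large enough.
            // An empty cluster therefore rejects every request.
            for (int capacity : liveNodes.values()) {
                if (capacity >= request.memoryMb) {
                    return true;
                }
            }
            return false;
        }
        // Location-specific request: the named node must still exist
        // and be large enough.
        Integer capacity = liveNodes.get(request.location);
        return capacity != null && capacity >= request.memoryMb;
    }
}
```

With a single live node `node1` of 8192 MB, a request pinned to the lost `node2` and a 65536 MB request for any location would both be reported, while a 4096 MB request for any location would stay pending. How the RM would hand the rejected requests back to the app is the open question of this issue and is not addressed by the sketch.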