Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
If container recovery takes an excessive amount of time (e.g., because HDFS is slow), the NodeManager (NM) could start servicing requests before all containers have been recovered. If an ApplicationMaster (AM) tries to kill a container while it is still recovering, the kill request could be lost.
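
To make the race concrete, here is a minimal toy model in Java. It is not NodeManager code: the class `RecoveryKillSketch`, its `State` enum, and all of its methods are hypothetical names invented for illustration. It assumes that a kill arriving for a container still in a recovering state is simply ignored, which is the behavior described above. The `deferKillsDuringRecovery` flag shows one possible mitigation (remembering the kill and applying it once recovery completes); it is not necessarily how this issue would be resolved in YARN.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model of the race; none of these names are real YARN classes.
public class RecoveryKillSketch {
    public enum State { RECOVERING, RUNNING, KILLED }

    private final Map<String, State> containers = new HashMap<>();
    private final Set<String> pendingKills = new HashSet<>();
    private final boolean deferKillsDuringRecovery;

    public RecoveryKillSketch(boolean deferKillsDuringRecovery) {
        this.deferKillsDuringRecovery = deferKillsDuringRecovery;
    }

    // Recovery of a container has started but not yet completed.
    public void beginRecovery(String id) {
        containers.put(id, State.RECOVERING);
    }

    // A kill request arrives (for example, from an AM).
    public void kill(String id) {
        State state = containers.get(id);
        if (state == State.RUNNING) {
            containers.put(id, State.KILLED);
        } else if (state == State.RECOVERING && deferKillsDuringRecovery) {
            // Assumed mitigation: remember the kill until recovery finishes.
            pendingKills.add(id);
        }
        // Otherwise the request is silently dropped: the lost-kill case.
    }

    // Recovery completes; apply any kill that was deferred in the meantime.
    public void finishRecovery(String id) {
        if (pendingKills.remove(id)) {
            containers.put(id, State.KILLED);
        } else {
            containers.put(id, State.RUNNING);
        }
    }

    public State stateOf(String id) {
        return containers.get(id);
    }

    public static void main(String[] args) {
        for (boolean defer : new boolean[] {false, true}) {
            RecoveryKillSketch nm = new RecoveryKillSketch(defer);
            nm.beginRecovery("container_1");
            nm.kill("container_1");           // arrives mid-recovery
            nm.finishRecovery("container_1");
            System.out.println("defer=" + defer + " -> " + nm.stateOf("container_1"));
        }
    }
}
```

With `defer=false` the container ends up `RUNNING` even though a kill was requested; with `defer=true` it ends up `KILLED`.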
Issue Links
- relates to YARN-4051 "ContainerKillEvent lost when container is still recovering and application finishes" (Resolved)