Details
New Feature
Status: Patch Available
Major
Resolution: Unresolved
2.7.1
None
None
Description
An unmanaged (leaked) container is a container that is no longer managed by the NM. Consequently, it can no longer be managed by YARN either.
There are many cases in which a YARN-managed container can become unmanaged, such as:
- The NM service is disabled or removed on the node.
- The NM is unable to start up again on the node, for example because a configuration or resource it depends on is not ready.
- The NM's local leveldb store is corrupted or lost, for example due to bad disk sectors.
- The NM has bugs, such as wrongly marking a live container as complete.
Note that these cases arise, or become worse, when work-preserving NM restart is enabled; see YARN-1336.
Unmanaged containers have bad impacts, such as:
- YARN cannot manage resources on the node:
  - YARN resources on the node are leaked.
  - The container cannot be killed to release its YARN resources, so they cannot be freed up for other, more urgent computations on the node.
- Container and App killing is not eventually consistent from the App user's perspective:
  - A buggy App can continue to affect external systems long after the App has been killed.
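
As a rough illustration of the definition above, an unmanaged container can be thought of as one that is still running on the node but is absent from the NM's view. The sketch below is not how YARN or the attached patch detects such containers. The function name `find_unmanaged` and the sample container IDs are hypothetical, and how the two input collections would be obtained in practice is left open.

```python
def find_unmanaged(running_on_node, tracked_by_nm):
    """Return IDs of containers that run on the node but are unknown to the NM.

    Both arguments are iterables of container ID strings. This is a plain
    set difference; it assumes the caller already has both views of the node.
    """
    return sorted(set(running_on_node) - set(tracked_by_nm))


# Made-up container IDs, for illustration only.
running = [
    "container_1410901177871_0001_01_000002",
    "container_1410901177871_0001_01_000003",
    "container_1410901177871_0002_01_000001",
]
tracked = [
    "container_1410901177871_0001_01_000002",
]

# Containers the NM no longer tracks, i.e. candidates for cleanup.
leaked = find_unmanaged(running, tracked)
print(leaked)
```

Any container reported this way still holds resources on the node, even though YARN can no longer account for it or kill it.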