Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Duplicate
Description
Sometimes, when the NodeManager starts with recovery enabled, it sends very large container statuses to the ResourceManager. As a result, the NodeManager fails to start because of an OOM.
In my case, the container statuses totaled 135 MB and contained 11 container statuses. I found that the diagnostics of 5 of those containers were very large (27 MB), so the attached patch truncates the container diagnostics.
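The truncation idea can be illustrated with a minimal sketch. This is not the attached patch: the class name `DiagnosticsTruncator`, the `MARKER` text, and the choice to keep the tail of the diagnostics (on the assumption that the most recent messages are the most useful) are all hypothetical and chosen only for illustration.

```java
/**
 * Illustrative sketch only: caps the size of a container diagnostics string.
 * All names here are hypothetical and are not taken from the YARN code base.
 */
public final class DiagnosticsTruncator {

    /** Hypothetical marker placed in front of truncated diagnostics. */
    public static final String MARKER = "...[truncated]...";

    private DiagnosticsTruncator() {
    }

    /**
     * Returns the diagnostics unchanged if they fit within maxChars;
     * otherwise keeps only the tail so the result has exactly maxChars
     * characters, prefixed with MARKER when there is room for it.
     */
    public static String truncate(String diagnostics, int maxChars) {
        if (maxChars < 0) {
            throw new IllegalArgumentException("maxChars must be >= 0");
        }
        if (diagnostics == null || diagnostics.length() <= maxChars) {
            return diagnostics;
        }
        if (maxChars <= MARKER.length()) {
            // Not enough room for the marker: keep only the raw tail.
            return diagnostics.substring(diagnostics.length() - maxChars);
        }
        int keep = maxChars - MARKER.length();
        return MARKER + diagnostics.substring(diagnostics.length() - keep);
    }
}
```

Applying a cap like this before the statuses are stored or reported would bound the size of each container status regardless of how much diagnostic text a container has accumulated.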
Attachments
Issue Links
- Is contained by: YARN-3998 Add support in the NodeManager to re-launch containers (Resolved)