Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Duplicate
- 2.6.2
- None
Description
Total resource count mistake:
In the multi-threaded model, the NodeRemovedSchedulerEvent raised in ReconnectNodeTransition can subtract newNode.getTotalCapability() from the total resource count. The RMNode and the scheduler process their events on different queues, so the remove-update-add operations are not guaranteed to run in sequence. As a result, when the scheduler handles NodeRemovedSchedulerEvent after the node has already been updated, it sometimes subtracts newNode.getTotalCapability() instead of the capability that was originally added for that node.
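The ordering problem described above can be reproduced in a small, deterministic sketch. This is not YARN code: Node, Scheduler, simulate and the capability values 8192 and 4096 are hypothetical stand-ins, and the two dispatcher threads are modelled by a single queue that is drained only after the node has been updated. The snapshot variant is shown only for contrast and is not necessarily how the issue was resolved.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class ReconnectRaceSketch {
    // Hypothetical stand-in for an RMNode; capability is memory in MB.
    static final class Node {
        int totalCapability;
        Node(int totalCapability) { this.totalCapability = totalCapability; }
    }

    // Hypothetical stand-in for the scheduler's cluster-wide resource counter.
    static final class Scheduler {
        int clusterTotal;
    }

    /**
     * Replays a reconnect as remove, update, add. The scheduler events are
     * queued and handled only after the node was updated, which models the
     * scheduler's queue lagging behind the RMNode's queue.
     *
     * @param snapshotAtEnqueue if true, the remove event carries a copy of the
     *                          capability taken when the event was created
     * @return the scheduler's total after all events were handled
     */
    static int simulate(boolean snapshotAtEnqueue) {
        Node node = new Node(8192);
        Scheduler scheduler = new Scheduler();
        scheduler.clusterTotal = node.totalCapability; // node registered earlier

        Queue<Runnable> schedulerQueue = new ArrayDeque<>();

        // Step 1: enqueue the "node removed" event.
        if (snapshotAtEnqueue) {
            final int removed = node.totalCapability; // value fixed at enqueue time
            schedulerQueue.add(() -> scheduler.clusterTotal -= removed);
        } else {
            // Reads the capability only when the event is handled.
            schedulerQueue.add(() -> scheduler.clusterTotal -= node.totalCapability);
        }

        // Step 2: the reconnecting node reports a new, smaller capability.
        node.totalCapability = 4096;

        // Step 3: enqueue the "node added" event.
        schedulerQueue.add(() -> scheduler.clusterTotal += node.totalCapability);

        // The scheduler handles its queue only now, after the update.
        Runnable event;
        while ((event = schedulerQueue.poll()) != null) {
            event.run();
        }
        return scheduler.clusterTotal;
    }

    public static void main(String[] args) {
        // The node's real capability after the reconnect is 4096.
        System.out.println("live reference: " + simulate(false)); // prints 8192
        System.out.println("snapshot:       " + simulate(true));  // prints 4096
    }
}
```

With the live reference, the remove event subtracts the new capability (4096) rather than the 8192 that was originally counted, so the total stays at 8192 even though the node now offers only 4096.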
Attachments
Issue Links
- is related to YARN-2561 MR job client cannot reconnect to AM after NM restart. (Closed)