- Type: Bug
- Status: Patch Available
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 2.7.5, 3.1.1, 2.9.2
- Fix Version/s: None
- Component/s: applicationmaster
- Labels: None
The ResourceManager may have already allocated a map container on a host ("h1", for example) for an application, while the MRAppMaster has not yet fetched that container. At this point the MRAppMaster receives a task failure event for a task that ran on h1, and the event causes h1 to be blacklisted. The MRAppMaster then sends a heartbeat and receives the container on h1. It cannot assign the container because the container is on a blacklisted host. #getContainerReqToReplace then fails to return another task, which may cause a map task to hang forever.
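The sequence of events can be illustrated with a toy model. This is only a sketch of the race as described here, not the actual MRAppMaster code: the class `BlacklistRaceSketch`, the `ToyAppMaster` type, the `reAskOnReject` flag and all method names are hypothetical, and the "re-ask" branch is just one conceivable remedy, not necessarily what the attached patch does.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

/** Toy model of the race; none of these names are real Hadoop APIs. */
final class BlacklistRaceSketch {

    /** Hypothetical stand-in for the MRAppMaster's allocation bookkeeping. */
    static final class ToyAppMaster {
        final Deque<String> pendingMaps = new ArrayDeque<>();
        final Set<String> blacklistedHosts = new HashSet<>();
        int outstandingAsks = 0;          // container requests the RM still has to satisfy
        final boolean reAskOnReject;      // assumed remedy: re-request after a rejection

        ToyAppMaster(boolean reAskOnReject) {
            this.reAskOnReject = reAskOnReject;
        }

        /** Schedule a map task and ask the RM for one container. */
        void requestMap(String taskId) {
            pendingMaps.add(taskId);
            outstandingAsks++;
        }

        /** A task failure blacklists the host it ran on. */
        void onTaskFailed(String host) {
            blacklistedHosts.add(host);
        }

        /** Called when a heartbeat delivers a container allocated on the given host. */
        void onContainerAllocated(String host) {
            outstandingAsks--;            // from the RM's point of view the ask is satisfied
            if (blacklistedHosts.contains(host)) {
                // The container cannot be used. Unless a new ask is issued,
                // nothing will ever be allocated for the pending map task.
                if (reAskOnReject) {
                    outstandingAsks++;
                }
                return;
            }
            pendingMaps.poll();           // container accepted, task is assigned
        }

        /** True when a map task is waiting but no request is outstanding. */
        boolean isStuck() {
            return !pendingMaps.isEmpty() && outstandingAsks == 0;
        }
    }

    /** Replays the reported scenario and reports whether the map task hangs. */
    static boolean runScenario(boolean reAskOnReject) {
        ToyAppMaster am = new ToyAppMaster(reAskOnReject);
        am.requestMap("map_0");           // RM allocates on h1, AM has not fetched it yet
        am.onTaskFailed("h1");            // failure event arrives first and blacklists h1
        am.onContainerAllocated("h1");    // next heartbeat delivers the h1 container
        return am.isStuck();
    }

    public static void main(String[] args) {
        System.out.println("stuck without re-ask: " + runScenario(false));
        System.out.println("stuck with re-ask:    " + runScenario(true));
    }
}
```

In this model the hang comes purely from the ordering: the blacklist update lands between the ResourceManager's allocation and the MRAppMaster's fetch, so the rejected container consumes the only outstanding request.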