Details
Type: Bug
Status: Patch Available
Priority: Major
Resolution: Unresolved
2.7.5, 3.1.1, 2.9.2
None
None
Description
The ResourceManager may have allocated a map container on a host ("h1", for example) for an application, but the MRAppMaster has not yet fetched the container. At this point the MRAppMaster receives a task failure event for a task that ran on h1, and the event causes h1 to be blacklisted. The MRAppMaster then sends a heartbeat and receives the container on h1. It cannot assign the container, because the container is on a blacklisted host. If #getContainerReqToReplace then fails to return another task, a map task may hang forever.
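
To make the sequence of events easier to follow, here is a minimal toy model of the race in Java. It is not the MRAppMaster code: the class BlacklistRaceSketch, its fields and its methods are all hypothetical, and it assumes that the ResourceManager treats a request as satisfied once it has handed out a container, so a rejected container leaves the pending map with no outstanding request unless the request is sent again.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Toy model of the race described above. All names are hypothetical;
// this is not the Hadoop implementation.
class BlacklistRaceSketch {
    // Hosts blacklisted by the (modelled) application master.
    private final Set<String> blacklistedHosts = new HashSet<>();
    // Map tasks that still need a container.
    private final Deque<String> pendingMaps = new ArrayDeque<>();
    // Requests the (modelled) resource manager has not yet satisfied.
    private int outstandingAsks = 0;
    // Assumption: whether a rejected container triggers a new request.
    private final boolean reAskOnReject;

    BlacklistRaceSketch(boolean reAskOnReject) {
        this.reAskOnReject = reAskOnReject;
    }

    void requestMap(String task) {
        pendingMaps.add(task);
        outstandingAsks++;
    }

    void taskFailedOn(String host) {
        blacklistedHosts.add(host);
    }

    // Returns the task that got the container, or null if it was rejected.
    String onContainerAllocated(String host) {
        // Assumption: the resource manager now considers one ask satisfied.
        outstandingAsks--;
        if (blacklistedHosts.contains(host)) {
            // The container cannot be used. Without a replacement request,
            // the pending map is left with nothing outstanding.
            if (reAskOnReject && !pendingMaps.isEmpty()) {
                outstandingAsks++;
            }
            return null;
        }
        return pendingMaps.poll();
    }

    // A map is stuck if it is pending and no request is outstanding.
    boolean isStuck() {
        return !pendingMaps.isEmpty() && outstandingAsks == 0;
    }

    // Replays the order of events from the description.
    static boolean replay(boolean reAskOnReject) {
        BlacklistRaceSketch am = new BlacklistRaceSketch(reAskOnReject);
        am.requestMap("map_0");        // container on h1 is allocated, not yet fetched
        am.taskFailedOn("h1");         // failure event blacklists h1
        am.onContainerAllocated("h1"); // heartbeat delivers the h1 container
        return am.isStuck();
    }
}
```

In this model, replay(false) leaves map_0 pending with no outstanding request, which corresponds to the hang reported here, while replay(true) keeps a request outstanding so the map can still be scheduled.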