Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Reviewed
Description
Scenario
=======
1. An RM HA cluster with 5 NMs is up and working fine.
2. Add one more NM to the same cluster, but do not update /etc/hosts on the RM.
3. Submit an application to the same cluster.
If the AM is allocated to the newly added NM, the application attempt gets stuck forever. The user has no way to find out why this happened.
Impact
1. RM logs are flooded with exceptions.
2. The application is stuck forever.
Handling suggestion: YARN-261 allows an application attempt to be failed.
If we fail the stuck attempt, the next attempt could be assigned to another NM.
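To illustrate the suggestion, here is a minimal sketch of the decision it implies: treat an NM address that the RM cannot resolve as a reason to fail the attempt, rather than retrying the launch indefinitely. This is not the actual RM code or the committed fix. The class `AmLaunchSketch`, the enum `Outcome`, the method `decide`, the host name `new-nm.example`, and the port are all hypothetical names chosen for illustration. Only `java.net.InetSocketAddress` and its `isUnresolved()` / `createUnresolved()` methods are real JDK API.

```java
import java.net.InetSocketAddress;

public class AmLaunchSketch {

    // Hypothetical outcomes for an AM launch decision (not YARN types).
    enum Outcome { LAUNCH, FAIL_ATTEMPT }

    // Hypothetical helper: if the NM host name could not be resolved on the RM
    // side (for example, /etc/hosts was not updated), fail the attempt so that
    // a later attempt can be scheduled on a different NM.
    static Outcome decide(InetSocketAddress nmAddress) {
        return nmAddress.isUnresolved() ? Outcome.FAIL_ATTEMPT : Outcome.LAUNCH;
    }

    public static void main(String[] args) {
        // An IP literal is always resolved without a lookup. The port is arbitrary.
        InetSocketAddress knownNm = new InetSocketAddress("127.0.0.1", 8041);

        // createUnresolved() skips the lookup and models a host the RM cannot resolve.
        InetSocketAddress newNm =
                InetSocketAddress.createUnresolved("new-nm.example", 8041);

        System.out.println(knownNm + " -> " + decide(knownNm));
        System.out.println(newNm + " -> " + decide(newNm));
    }
}
```

With this kind of check, the unresolved NM leads to a failed attempt instead of an endless retry loop, and the failure reason can be surfaced to the user in the attempt diagnostics.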