Details
- Bug
- Status: Resolved
- Major
- Resolution: Duplicate
- 0.23.2
- None
- None
Description
If the MRAppMaster process is killed continuously, the application is retried up to "yarn.resourcemanager.am.max-retries" times before the job is failed. Each of these kills is counted against Pending applications, which then shows a negative value on the Cluster Metrics page.
This misrepresents the actual number of jobs in the Pending state for the cluster. Even if the MRAppMaster kill count is monitored, it should be tracked at the job level and not at the cluster level.
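The arithmetic behind the negative value can be sketched as follows. This is only a hypothetical model for illustration, not the actual ResourceManager code: the class `PendingCounterSketch`, its methods, and the retry limit of 4 are all assumptions. It assumes the cluster-wide pending count is decremented once per killed MRAppMaster attempt, and contrasts that with tracking kills per job.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the counters; names and logic are assumptions,
// not taken from the Hadoop source.
class PendingCounterSketch {

    // Assumed faulty accounting: every killed AM attempt is subtracted
    // from the cluster-wide pending count.
    static int clusterLevelPending(int submittedApps, int killsPerApp) {
        int pending = submittedApps;              // each app counted once on submit
        for (int app = 0; app < submittedApps; app++) {
            for (int kill = 0; kill < killsPerApp; kill++) {
                pending--;                        // one decrement per killed attempt
            }
        }
        return pending;
    }

    // Suggested accounting: kills are recorded per job, and the pending
    // count drops only once, when the job exhausts its retries and fails.
    static int jobLevelPending(int submittedApps, int killsPerApp,
                               int maxRetries, Map<Integer, Integer> killsByJob) {
        int pending = submittedApps;
        for (int app = 0; app < submittedApps; app++) {
            killsByJob.put(app, killsPerApp);     // kill count stays with the job
            if (killsPerApp >= maxRetries) {
                pending--;                        // job failed: leaves Pending once
            }
        }
        return pending;
    }

    public static void main(String[] args) {
        int maxRetries = 4;                       // assumed value for illustration
        Map<Integer, Integer> killsByJob = new HashMap<>();

        // One app whose AM is killed on every attempt.
        System.out.println(clusterLevelPending(1, maxRetries));                     // prints -3
        System.out.println(jobLevelPending(1, maxRetries, maxRetries, killsByJob)); // prints 0
        System.out.println(killsByJob);                                             // prints {0=4}
    }
}
```

Under these assumptions, a single application killed four times drives the cluster-level count to -3, while the job-level variant keeps the pending count at 0 and still exposes the kill count for that job.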