Details
- Type: Improvement
- Status: Closed
- Priority: Major
- Resolution: Fixed
- None
- None
- Reviewed
Description
Application attempts can fail because of many kinds of user error, and such attempts should not be retried unnecessarily. YARN should retry an attempt only when the hardware fails or YARN itself has an error. A failing NM, a lost NM, and NM disk errors are the hardware errors that come to mind.
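To make the proposal concrete, here is a minimal sketch of the retry decision described above. It is only an illustration: the class `AttemptRetryPolicy`, the enum `FailureCause`, and the method `shouldRetry` are hypothetical names invented for this example and are not taken from the YARN code base.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical sketch; none of these names come from the YARN code base. */
public class AttemptRetryPolicy {

    /** Illustrative failure causes for an application attempt. */
    enum FailureCause {
        USER_ERROR,     // e.g. a bug or misconfiguration in the application
        NM_FAILED,      // the NodeManager running the attempt failed
        NM_LOST,        // the NodeManager was lost
        NM_DISK_ERROR,  // the NodeManager reported disk errors
        YARN_ERROR      // an error inside YARN itself
    }

    /** Causes that are not the user's fault, so a retry is worthwhile. */
    private static final Set<FailureCause> RETRYABLE = EnumSet.of(
            FailureCause.NM_FAILED,
            FailureCause.NM_LOST,
            FailureCause.NM_DISK_ERROR,
            FailureCause.YARN_ERROR);

    /** Returns true only for hardware failures or YARN errors. */
    static boolean shouldRetry(FailureCause cause) {
        return RETRYABLE.contains(cause);
    }

    public static void main(String[] args) {
        // Print the decision for every cause in the sketch.
        for (FailureCause cause : FailureCause.values()) {
            System.out.println(cause + " -> retry=" + shouldRetry(cause));
        }
    }
}
```

Under this sketch a user error is never retried, while any NM-related or YARN-internal failure is. How the real failure cause would be detected and reported is outside the scope of the example.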
Attachments
Issue Links
- is depended upon by
  - YARN-896 Roll up for long-lived services in YARN (Open)
- is related to
  - YARN-891 Store completed application information in RM state store (Closed)
  - YARN-2074 Preemption of AM containers shouldn't count towards AM failures (Closed)
  - YARN-2355 MAX_APP_ATTEMPTS_ENV may no longer be a useful env var for a container (Resolved)
  - TEZ-3426 Second AM attempt launched for session mode and recovery disabled for certain cases (Closed)
  - YARN-542 Change the default global AM max-attempts value to be not one (Closed)
- relates to
  - YARN-611 Add an AM retry count reset window to YARN RM (Closed)