Details
- Bug
- Status: Closed
- Major
- Resolution: Fixed
- None
- None
- None
- Reviewed
Description
Today, the global AM max-attempts is set to 1, which is a bad choice. AM max-attempts accounts for both AM-level failures and container crashes due to localization issues, lost nodes, etc. To account for AM crashes caused by problems outside user code, mainly lost nodes, we want to give AMs some retries.
I propose we change it to at least two. We could change it to 4 to match other retry configs.
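The effect of the limit can be illustrated with a small sketch. It is not YARN code: `am_survives` is a hypothetical helper, and it assumes the simple model described above, in which every failed AM attempt counts against the limit regardless of whether user code or the platform caused it. In current Hadoop releases the global limit is exposed as the `yarn.resourcemanager.am.max-attempts` property.

```python
def am_survives(max_attempts: int, failures: int) -> bool:
    """Return True if the application can still run after `failures` failed AM attempts.

    Hypothetical model, not the ResourceManager's implementation: each failed
    attempt (user bug, lost node, localization problem) consumes one attempt,
    and the application fails once all `max_attempts` attempts have failed.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    if failures < 0:
        raise ValueError("failures must not be negative")
    return failures < max_attempts


if __name__ == "__main__":
    # Compare the current default (1) with the proposed values (2 and 4)
    # when a single AM attempt is lost because its node went away.
    for limit in (1, 2, 4):
        print(f"max-attempts={limit}: survives one lost node -> {am_survives(limit, 1)}")
```

Under this model, a limit of 1 means a single lost node fails the whole application, while a limit of 2 tolerates one such failure and a limit of 4 tolerates three.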
Attachments
Issue Links
- is related to
- MAPREDUCE-5145 Change default max-attempts to be more than one for MR jobs as well
- Closed