Details
- Improvement
- Status: Resolved
- Major
- Resolution: Fixed
- 0.23.10, 2.4.0
- None
- Reviewed
Description
It would be nice if the RM supported blacklisting a node for an AM launch after the same node fails a configurable number of AM attempts. This would be similar to the blacklisting support for scheduling task attempts in the MapReduce AM but for scheduling AM attempts on the RM side.
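The request above can be illustrated with a minimal sketch of per-node failure counting. This is not the ResourceManager's implementation and uses no YARN APIs: the class `AmNodeBlacklist`, its methods, and the `maxFailuresPerNode` threshold are hypothetical names chosen for illustration, and nodes are identified by plain host strings.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not a YARN class: counts failed AM attempts per node
// and reports which nodes have reached the configured failure threshold.
class AmNodeBlacklist {
    // Assumed configurable threshold: failures on one node before it is avoided.
    private final int maxFailuresPerNode;
    private final Map<String, Integer> failuresByNode = new HashMap<>();

    AmNodeBlacklist(int maxFailuresPerNode) {
        if (maxFailuresPerNode < 1) {
            throw new IllegalArgumentException("maxFailuresPerNode must be >= 1");
        }
        this.maxFailuresPerNode = maxFailuresPerNode;
    }

    // Record that an AM attempt failed on the given node.
    void recordAmFailure(String node) {
        failuresByNode.merge(node, 1, Integer::sum);
    }

    // True once the node has failed at least maxFailuresPerNode AM attempts.
    boolean isBlacklisted(String node) {
        return failuresByNode.getOrDefault(node, 0) >= maxFailuresPerNode;
    }

    // Read-only snapshot of nodes to skip when placing the next AM attempt.
    Set<String> blacklistedNodes() {
        Set<String> result = new HashSet<>();
        for (Map.Entry<String, Integer> entry : failuresByNode.entrySet()) {
            if (entry.getValue() >= maxFailuresPerNode) {
                result.add(entry.getKey());
            }
        }
        return Collections.unmodifiableSet(result);
    }
}
```

In this sketch a scheduler would consult `isBlacklisted` (or the snapshot from `blacklistedNodes`) before choosing a node for the next AM attempt. A real design would also need to decide when entries expire and what to do when most of the cluster ends up excluded; neither is modeled here.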
Attachments
Issue Links
- duplicates
  - MAPREDUCE-6511 MRAppMaster second attempt starting on the same node as a previously failed MRAppMaster attempt (Resolved)
  - YARN-2293 Scoring for NMs to identify a better candidate to launch AMs (Resolved)
  - YARN-3744 ResourceManager should avoid allocating AM to same node repeatedly in case of AM launch failures (Resolved)
- is depended upon by
  - YARN-896 Roll up for long-lived services in YARN (Open)
- is duplicated by
  - YARN-4217 Failed AM attempt retries on same failed host (Resolved)
  - YARN-8352 AM should retry on a different node after the previous application attempt fail (Resolved)
- is related to
  - YARN-3994 RM should respect AM resource/placement constraints (Open)
  - YARN-4837 User facing aspects of 'AM blacklisting' feature need fixing (Resolved)
  - YARN-3803 Application hangs after more then one localization attempt fails on the same NM (Resolved)
  - YARN-4576 Enhancement for tracking Blacklist in AM Launching (Open)
  - YARN-1073 NM to recognise when it can't spawn process and stop accepting containers (Open)
- relates to
  - YARN-4247 Deadlock in FSAppAttempt and RMAppAttemptImpl causes RM to stop processing events (Resolved)
  - YARN-4685 Disable AM blacklisting by default to mitigate situations that application get hanged (Resolved)
  - YARN-4284 condition for AM blacklisting is too narrow (Resolved)
  - YARN-4670 add logging when a node is AM-blacklisted (Open)
  - YARN-964 Give a parameter that can set AM retry interval (Resolved)