Description
ApplicationState.MAX_NUM_RETRY controls the maximum number of back-to-back executor failures that the standalone cluster manager will tolerate before it removes a faulty application. It is currently a hardcoded constant (10), but there are use cases for making it configurable (TBD in my PR). We should add a new configuration key so that users can customize this limit.
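To make the proposal concrete, here is a minimal sketch of the idea in Python rather than Spark's own Scala. It is not the actual patch: the key name `spark.deploy.maxExecutorRetries`, the dict-based configuration, and the `AppRetryTracker` class are assumptions made for illustration, since this issue does not name the new key. The only value taken from the description is the current default of 10.

```python
# Illustrative sketch only; not Spark's implementation.
# ASSUMPTION: the key name below is not given in this issue.
MAX_RETRIES_KEY = "spark.deploy.maxExecutorRetries"
DEFAULT_MAX_RETRIES = 10  # the currently hardcoded value, per the description


def max_retries_from_conf(conf):
    """Read the retry limit from a dict-like config, falling back to 10."""
    return int(conf.get(MAX_RETRIES_KEY, DEFAULT_MAX_RETRIES))


class AppRetryTracker:
    """Hypothetical helper that counts back-to-back executor failures."""

    def __init__(self, max_retries):
        self.max_retries = max_retries
        self.retry_count = 0

    def executor_succeeded(self):
        # A healthy executor breaks the streak of consecutive failures.
        self.retry_count = 0

    def executor_failed(self):
        """Record a failure; return True once the app should be removed."""
        self.retry_count += 1
        return self.retry_count >= self.max_retries


# Usage: a user lowers the limit from the default of 10 to 3.
tracker = AppRetryTracker(max_retries_from_conf({MAX_RETRIES_KEY: "3"}))
results = [tracker.executor_failed() for _ in range(3)]
```

With the limit set to 3, the first two failures are tolerated and the third marks the application for removal. An empty configuration keeps today's behavior of 10.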
Issue Links
- duplicates SPARK-2424 ApplicationState.MAX_NUM_RETRY should be configurable (Resolved)