Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- 1.6.0
- None
Description
In the current implementation of Standalone Master HA, when an application is submitted in cluster mode, the driver's resources (CPU cores and memory) are not counted again after the master recovers from a failure. This leads to unexpected behavior, such as more executors than expected and negative core and memory usage in the web UI. In addition, a recovered application's state always stays WAITING; it should be changed to RUNNING once the application is fully recovered.
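The accounting problem can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Spark's actual Master code: the `Worker` class, the `recover_worker` function, and the `recount_drivers` flag are hypothetical names invented for illustration, and the core and memory figures are made up.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    """Hypothetical stand-in for the master's per-worker resource bookkeeping."""
    cores: int
    memory_mb: int
    cores_used: int = 0
    memory_used_mb: int = 0

    def add_driver(self, cores, memory_mb):
        # Charge a driver's resources against this worker.
        self.cores_used += cores
        self.memory_used_mb += memory_mb

    def remove_driver(self, cores, memory_mb):
        # Release a driver's resources when the driver finishes.
        self.cores_used -= cores
        self.memory_used_mb -= memory_mb

    @property
    def cores_free(self):
        return self.cores - self.cores_used


def recover_worker(cores, memory_mb, running_drivers, recount_drivers):
    """Rebuild a worker's bookkeeping on a newly elected master.

    running_drivers: list of (cores, memory_mb) for drivers still alive.
    recount_drivers: False models the reported behavior, True the expected one.
    """
    worker = Worker(cores, memory_mb)
    if recount_drivers:
        for d_cores, d_mem in running_drivers:
            worker.add_driver(d_cores, d_mem)
    return worker


driver = (2, 1024)  # illustrative values: 2 cores, 1024 MB

buggy = recover_worker(8, 8192, [driver], recount_drivers=False)
fixed = recover_worker(8, 8192, [driver], recount_drivers=True)

# Without recounting, the recovered master believes all 8 cores are free,
# so it may schedule more executors than expected.
print(buggy.cores_free, fixed.cores_free)  # 8 6

# When the driver later finishes, its resources are subtracted.
buggy.remove_driver(*driver)
fixed.remove_driver(*driver)
print(buggy.cores_used, fixed.cores_used)  # -2 0
```

In this model, skipping the recount on recovery produces both symptoms from the description: too much apparent free capacity while the driver runs, and negative usage once it exits.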
Issue Links
- is duplicated by
  - SPARK-21169 Spark HA: Jobs state is in WAITING status after reconnecting to standby master (Resolved)
  - SPARK-18554 leader master lost the leadership, when the slave become master, the perivious app's state display as waitting (Resolved)
  - SPARK-20058 the running application status changed from running to waiting when a master is down and it change to another standy by master (Resolved)