Details
- Type: Bug
- Status: Closed
- Priority: Critical
- Resolution: Fixed
- Reviewed
Description
Currently, if anything abnormal happens in ZKRMStateStore, it throws a fatal exception that crashes the RM. As shown in YARN-1924, the cause can be an internal bug in RM HA itself rather than a genuinely fatal condition. We should revisit this decision, since the HA feature is designed to protect a key component, not to disrupt it.
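To make the trade-off concrete, here is a minimal sketch of one way the decision could be made configurable. It is not the actual patch: the class `StoreFailurePolicy`, the `Action` enum, and the `failFast` / `haEnabled` flags are hypothetical names introduced only for illustration.

```java
// Hypothetical sketch: decide how the RM reacts to a state-store failure.
// None of these names come from the Hadoop code base.
final class StoreFailurePolicy {

    enum Action {
        CRASH,            // current behavior: treat every store error as fatal
        STEP_DOWN,        // give up the active role and let a standby take over
        LOG_AND_CONTINUE  // record the error and keep serving
    }

    /**
     * @param failFast  assumed switch: true keeps the crash-on-error behavior
     * @param haEnabled assumed flag: true when a standby RM is available
     */
    static Action decide(boolean failFast, boolean haEnabled) {
        if (failFast) {
            return Action.CRASH;
        }
        if (haEnabled) {
            return Action.STEP_DOWN;
        }
        return Action.LOG_AND_CONTINUE;
    }

    public static void main(String[] args) {
        System.out.println(decide(true, true));
        System.out.println(decide(false, true));
        System.out.println(decide(false, false));
    }
}
```

Ignoring a store failure is not free either: as the linked YARN-4118 describes, a newly submitted app may get stuck at saving state, so any non-crashing branch would need its own recovery path.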
Attachments
Issue Links
- is related to: YARN-4118 Newly submitted app maybe stuck at saving state if store operation failure is ignored in ZKRMStateStore (Open)