Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 2.10.1, 3.3.3
Fix Version/s: None
Component/s: None
Description
All RMs in HA are stuck in standby when the ZK connection held by the active RM is disconnected.
2023-02-22 13:08:19,832 INFO org.apache.hadoop.ha.ActiveStandbyElector (main-EventThread): Session disconnected. Entering neutral mode...
2023-02-22 13:08:19,832 WARN org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService (main-EventThread): Lost contact with Zookeeper. Transitioning to standby in 10000 ms if connection is not reestablished.
Repro:
Send a Disconnected event to the active RM using the code below:
zkConnectionState = ConnectionState.DISCONNECTED;
enterNeutralMode();
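
The WARN line above describes a grace period: after losing contact with ZooKeeper, the RM waits 10000 ms and transitions to standby unless the connection is reestablished. A minimal sketch of that timing rule follows, to make the expected behavior easier to reason about. It is a simplified model, not Hadoop's implementation: the class NeutralModeSketch, its methods, and the explicit clock argument are all hypothetical names introduced for illustration.

```java
// Simplified, hypothetical model of the grace period described in the WARN log line.
// It is NOT the Hadoop code; names and structure are assumptions for illustration.
class NeutralModeSketch {
    enum State { ACTIVE, NEUTRAL, STANDBY }

    private final long graceMs;        // e.g. 10000, as in the log message
    private State state = State.ACTIVE;
    private long disconnectedAtMs = -1;

    NeutralModeSketch(long graceMs) {
        this.graceMs = graceMs;
    }

    // Called when the ZK session reports a disconnect.
    void onDisconnected(long nowMs) {
        if (state == State.ACTIVE) {
            state = State.NEUTRAL;
            disconnectedAtMs = nowMs;
        }
    }

    // Called when the ZK connection comes back before the grace period has elapsed.
    void onReconnected(long nowMs) {
        if (state == State.NEUTRAL && nowMs - disconnectedAtMs < graceMs) {
            state = State.ACTIVE;
            disconnectedAtMs = -1;
        }
    }

    // Called periodically; moves to standby once the grace period has elapsed.
    void tick(long nowMs) {
        if (state == State.NEUTRAL && nowMs - disconnectedAtMs >= graceMs) {
            state = State.STANDBY;
        }
    }

    State state() {
        return state;
    }

    public static void main(String[] args) {
        NeutralModeSketch rm = new NeutralModeSketch(10000);
        rm.onDisconnected(0);
        rm.tick(5000);
        System.out.println(rm.state());
        rm.tick(10000);
        System.out.println(rm.state());
    }
}
```

In this model a reconnect inside the grace period returns the RM to ACTIVE, while an expired grace period leaves it in STANDBY until something else triggers a new election. The report above concerns the second path: after the forced disconnect, every RM ends up in standby and stays there.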