Details
- Type: Bug
- Status: Resolved
- Priority: Critical
- Resolution: Duplicate
Description
An application with a large ApplicationSubmissionContext can fail to be stored in ZooKeeper because of the znode size limit. When the limit is exceeded, the failure surfaces as org.apache.zookeeper.KeeperException$ConnectionLossException. ZkStateStore retries the configured number of times and then throws ConnectionLossException.
This can cause the ResourceManager to switch from active to standby, and other submitted applications are then not saved to ZooKeeper.
Solution: validate the ApplicationStateData size before saving and reject the oversized application, so that the ResourceManager is not impacted.
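A minimal sketch of the proposed check, not the actual patch: it compares the serialized state size against a limit before any write is attempted and rejects the application instead of letting the store retry. The class StateSizeGuard, its method names, and the exception type are hypothetical. The default limit of 0xfffff bytes assumes ZooKeeper's default jute.maxbuffer. A real check would likely leave some headroom, since that limit applies to the whole request rather than to the data alone.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Hadoop: validates serialized state size up front.
final class StateSizeGuard {
    // Assumption: ZooKeeper's default jute.maxbuffer (0xfffff bytes, just under 1 MiB).
    static final int DEFAULT_ZNODE_LIMIT_BYTES = 0xfffff;

    private final int maxBytes;

    StateSizeGuard(int maxBytes) {
        if (maxBytes <= 0) {
            throw new IllegalArgumentException("maxBytes must be positive");
        }
        this.maxBytes = maxBytes;
    }

    // True if the serialized state is small enough to be written to a znode.
    boolean fits(byte[] serializedState) {
        return serializedState.length <= maxBytes;
    }

    // Rejects the application before any ZooKeeper write is attempted.
    void checkOrReject(String appId, byte[] serializedState) {
        if (!fits(serializedState)) {
            throw new IllegalStateException("Rejecting " + appId + ": state is "
                + serializedState.length + " bytes, limit is " + maxBytes);
        }
    }

    public static void main(String[] args) {
        StateSizeGuard guard = new StateSizeGuard(DEFAULT_ZNODE_LIMIT_BYTES);
        byte[] small = "small submission context".getBytes(StandardCharsets.UTF_8);
        byte[] large = new byte[2 * 1024 * 1024]; // stands in for an oversized context

        guard.checkOrReject("app_small", small); // passes silently
        try {
            guard.checkOrReject("app_large", large);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing fast at submission time affects only the oversized application, so the store never enters the retry loop that leads to the active-to-standby transition.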
Issue Links
- is duplicated by
  YARN-5006 ResourceManager quit due to ApplicationStateData exceed the limit size of znode in zk (Resolved)