Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
- Versions: 1.12.2, 1.13.0
Description
It looks like KubernetesLeaderRetrievalDriver is not closed even after the KubernetesLeaderElectionDriver is closed and the job has reached a globally terminal state.
As a result, many ConfigMap watches stay active, each keeping a connection to the Kubernetes API server open.
Once the number of connections exceeds the maximum concurrent requests, new ConfigMap watches cannot be started, and eventually all newly submitted jobs time out.
fly_in_gis trohrmann This may be related to FLINK-20695; could you confirm this issue?
However, when many jobs are running in the same session cluster, the ConfigMap watches do need to stay active. Maybe we should merge all ConfigMap watches into a single shared watcher (see the sketch below)?
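To make the merging idea concrete, here is a minimal, hypothetical sketch of a shared ConfigMap watcher: one underlying watch connection multiplexes events to per-ConfigMap listeners, and each listener deregisters through its own handle instead of holding its own connection. All names below (SharedConfigMapWatcher, ConfigMapEvent, ConfigMapEventListener, UnderlyingWatch) are illustrative only and are not existing Flink or Kubernetes-client classes.
{code:java}
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArraySet;

/**
 * Hypothetical sketch only: instead of opening one API-server watch per job,
 * all interested components register listeners on a single multiplexer keyed
 * by ConfigMap name.
 */
public final class SharedConfigMapWatcher implements AutoCloseable {

    /** Minimal stand-in for a watch event delivered by the Kubernetes client. */
    public record ConfigMapEvent(String configMapName, Map<String, String> data) {}

    /** Callback a leader retrieval component would implement. */
    public interface ConfigMapEventListener {
        void onConfigMapChanged(ConfigMapEvent event);
    }

    /** Stand-in for the single real watch connection to the API server. */
    public interface UnderlyingWatch extends AutoCloseable {
        @Override
        void close(); // no checked exception, for brevity
    }

    private final Map<String, Set<ConfigMapEventListener>> listenersByConfigMap =
            new ConcurrentHashMap<>();
    private final UnderlyingWatch underlyingWatch;

    public SharedConfigMapWatcher(UnderlyingWatch underlyingWatch) {
        this.underlyingWatch = underlyingWatch;
    }

    /** Registers a listener; closing the returned handle deregisters it. */
    public AutoCloseable watch(String configMapName, ConfigMapEventListener listener) {
        listenersByConfigMap
                .computeIfAbsent(configMapName, name -> new CopyOnWriteArraySet<>())
                .add(listener);
        return () -> {
            Set<ConfigMapEventListener> listeners = listenersByConfigMap.get(configMapName);
            if (listeners != null) {
                listeners.remove(listener);
                if (listeners.isEmpty()) {
                    listenersByConfigMap.remove(configMapName);
                }
            }
        };
    }

    /** Called by the single underlying watch whenever any ConfigMap changes. */
    public void dispatch(ConfigMapEvent event) {
        Set<ConfigMapEventListener> listeners =
                listenersByConfigMap.getOrDefault(event.configMapName(), Set.of());
        for (ConfigMapEventListener listener : listeners) {
            listener.onConfigMapChanged(event);
        }
    }

    /** Tearing down the shared watcher closes the one real connection. */
    @Override
    public void close() {
        listenersByConfigMap.clear();
        underlyingWatch.close();
    }
}
{code}
With such a multiplexer, closing one job's leader retrieval would only remove its listener, while the single real connection to the API server is reused across all jobs and is torn down only when the cluster itself shuts down.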
Attachments
Issue Links
- is related to
  - FLINK-22006 Could not run more than 20 jobs in a native K8s session when K8s HA enabled (Closed)
  - FLINK-22054 Using a shared watcher for ConfigMap watching (Closed)
- relates to
  - FLINK-20695 Zookeeper node under leader and leaderlatch is not deleted after job finished (Closed)
- links to