Flink / FLINK-21942

KubernetesLeaderRetrievalDriver is not closed after job termination, which leads to a connection leak

    Details

      Description

      It looks like KubernetesLeaderRetrievalDriver is not closed even after the KubernetesLeaderElectionDriver is closed and the job reaches a globally terminated state.
      As a result, many ConfigMap watches stay active and keep their connections to K8s open.

      When the number of connections exceeds the max concurrent requests, new ConfigMap watches cannot be started. Eventually, all newly submitted jobs time out.
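
      To illustrate the failure mode described above, here is a minimal sketch. It does not use Flink or the Kubernetes client; `WatchLimitSketch`, `Watch`, and `tryStartWatch` are hypothetical names, and a `Semaphore` stands in for the client's limit on concurrent requests. It only shows that a watch which is never closed keeps its slot, so later watches cannot start.

```java
import java.util.concurrent.Semaphore;

// Hypothetical stand-in for a client that allows a fixed number of concurrent requests.
// This is an illustration of the leak, not Flink or Kubernetes client code.
final class WatchLimitSketch {

    /** A watch holds one connection slot until it is closed. */
    static final class Watch implements AutoCloseable {
        private final Semaphore slots;
        private boolean closed;

        Watch(Semaphore slots) {
            this.slots = slots;
        }

        @Override
        public synchronized void close() {
            // Release the slot exactly once, even if close() is called repeatedly.
            if (!closed) {
                closed = true;
                slots.release();
            }
        }
    }

    private final Semaphore slots;

    WatchLimitSketch(int maxConcurrentRequests) {
        this.slots = new Semaphore(maxConcurrentRequests);
    }

    /** Returns a new watch, or null when every slot is held by a watch that is still open. */
    Watch tryStartWatch() {
        return slots.tryAcquire() ? new Watch(slots) : null;
    }

    int freeSlots() {
        return slots.availablePermits();
    }

    public static void main(String[] args) {
        WatchLimitSketch client = new WatchLimitSketch(2);
        Watch finishedJobA = client.tryStartWatch();
        Watch finishedJobB = client.tryStartWatch();

        // Both jobs have terminated, but their watches were never closed,
        // so a newly submitted job cannot start its own watch.
        Watch newJob = client.tryStartWatch();
        System.out.println("new watch started before cleanup: " + (newJob != null));

        // Closing a leaked watch frees its slot for the new job.
        finishedJobA.close();
        newJob = client.tryStartWatch();
        System.out.println("new watch started after cleanup: " + (newJob != null));
    }
}
```

      In this sketch, closing the watch when the job terminates is the only thing that returns the slot, which matches the symptom reported here.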

      Yang Wang, Till Rohrmann: this may be related to FLINK-20695. Could you confirm this issue?
      However, when many jobs are running in the same session cluster, the ConfigMap watches do need to stay active. Maybe we should merge all ConfigMap watches into one?
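
      The idea of merging the watches could look roughly like the sketch below. This is only an assumption about one possible design, not the actual fix and not a Flink API: `SharedConfigMapWatchSketch`, `register`, `unregister`, and `onEvent` are hypothetical names. A single shared watch dispatches events to per-ConfigMap listeners, and a terminated job only removes its listener instead of holding its own connection.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical sketch: one shared watch fans events out to per-ConfigMap listeners.
final class SharedConfigMapWatchSketch {

    // Key: ConfigMap name. Value: callback that receives the new leader address.
    private final Map<String, Consumer<String>> listeners = new ConcurrentHashMap<>();

    /** Called when a job starts and needs leader updates for its ConfigMap. */
    void register(String configMapName, Consumer<String> listener) {
        listeners.put(configMapName, listener);
    }

    /** Called when a job reaches a globally terminated state. */
    void unregister(String configMapName) {
        listeners.remove(configMapName);
    }

    /** Called by the single underlying watch for every ConfigMap change it sees. */
    void onEvent(String configMapName, String leaderAddress) {
        Consumer<String> listener = listeners.get(configMapName);
        if (listener != null) {
            listener.accept(leaderAddress);
        }
    }

    int listenerCount() {
        return listeners.size();
    }
}
```

      With this shape, the number of connections to K8s would no longer grow with the number of jobs in the session cluster; only the listener map does.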

        Attachments

        1. image-2021-03-24-18-08-30-196.png
          360 kB
          Yi Tang
        2. image-2021-03-24-18-08-42-116.png
          363 kB
          Yi Tang
        3. jstack.l
          303 kB
          Yi Tang

              People

              • Assignee:
                Yang Wang (fly_in_gis)
                Reporter:
                Yi Tang (yittg)
              • Votes:
                0
                Watchers:
                5