Flink / FLINK-33053

Watcher leak in Zookeeper HA mode

Details

    Description

      We observe a watcher leak in our OLAP stress test when ZooKeeper HA mode is enabled. The watches that the TaskManager (TM) registers on the JobMaster leader node are not removed after the job finishes.

      To reproduce the issue:

      • Start a session cluster with ZooKeeper HA mode enabled.
      • Continuously and concurrently submit short queries (e.g. WordCount) to the cluster.
      • Run echo -n wchp | nc {zk host} {zk port} to list the current watches by path.

      The output shows a large number of watches on /flink/{cluster_name}/leader/{job_id}/connection_info.
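
      The check in the last step can be scripted. The following is a minimal sketch, not part of the original report: it sends the wchp four-letter-word command over a plain socket and counts watches per znode path. The function names (four_letter_word, count_watches_by_path, leader_connection_watches) are hypothetical, and the parser assumes the usual wchp layout of one path per line followed by indented session IDs. On ZooKeeper 3.5 and later, wchp may need to be allowed through 4lw.commands.whitelist.

```python
import socket
from collections import Counter


def four_letter_word(host: str, port: int, command: str = "wchp",
                     timeout: float = 5.0) -> str:
    """Send a ZooKeeper four-letter-word command and return the raw reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def count_watches_by_path(wchp_output: str) -> Counter:
    """Count watchers per znode path.

    Assumed layout: a path on an unindented line, then one indented
    session ID per watcher of that path.
    """
    counts = Counter()
    current_path = None
    for line in wchp_output.splitlines():
        if not line.strip():
            continue
        if line[0] in " \t":
            # Indented line: one watcher (session ID) on the current path.
            if current_path is not None:
                counts[current_path] += 1
        else:
            current_path = line.strip()
    return counts


def leader_connection_watches(counts: Counter) -> dict:
    """Keep only watches on .../leader/{job_id}/connection_info nodes."""
    return {
        path: n
        for path, n in counts.items()
        if "/leader/" in path and path.endswith("/connection_info")
    }
```

      For example, leader_connection_watches(count_watches_by_path(four_letter_word("{zk host}", 2181))) returns the per-job watch counts; the port 2181 is only the common default and is an assumption here. If the number of entries keeps growing after the submitted jobs have finished, that matches the leak described above.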

      Attachments

        1. taskmanager_flink-native-test-117-taskmanager-1-9_thread_dump (1).json
          157 kB
          Yangze Guo
        2. 26.dump.zip
          42.01 MB
          Yangze Guo
        3. 26.log
          50 kB
          Yangze Guo

            People

              guoyangze Yangze Guo
              Votes: 0
              Watchers: 8