Flink / FLINK-22636

Group job specific ZooKeeper HA services under common jobs/<JobID> zNode


Details

    • The ZooKeeper job-specific HA services are now grouped under a zNode named after the respective `JobID`. Moreover, the config options `high-availability.zookeeper.path.latch`, `high-availability.zookeeper.path.leader`, `high-availability.zookeeper.path.checkpoints` and `high-availability.zookeeper.path.checkpoint-counter` have been removed and therefore no longer have any effect.

    Description

      To make cleaning up the ZooKeeper HA services easier, I suggest grouping job-specific services under a common jobs/<JobID> zNode. That way, cleaning up the job-specific ZooKeeper data becomes trivial: simply delete the jobs/<JobID> node.

      Currently, our ZooKeeper layout is not well structured: job-specific nodes are spread across several subtrees. The current layout looks like this:

      clusterID -> jobgraphs -> <job-id>
                -> checkpoints -> <job-id> -> checkpoint-1
                -> checkpoint-counter -> <job-id> -> counter
                -> leaderlatch -> dispatcher_lock
                               -> resource_manager_lock
                               -> <job-id>
                -> leader -> dispatcher_lock
                          -> resource_manager_lock
                          -> <job-id>
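
      To illustrate why cleanup is awkward with this layout, here is a minimal sketch that lists the per-job zNodes a cleanup routine would have to visit. The class name, method name and the clusterRoot parameter are hypothetical and not part of Flink's API; the path segments are taken from the diagram above. The jobgraphs/<job-id> node is left out because it sits in the same place in both layouts.

```java
import java.util.List;

/** Hypothetical helper for illustration only; not part of Flink's API. */
class CurrentLayoutSketch {

    /** Returns the job-specific zNodes that are scattered across the current layout. */
    static List<String> jobSpecificPaths(String clusterRoot, String jobId) {
        return List.of(
                clusterRoot + "/checkpoints/" + jobId,
                clusterRoot + "/checkpoint-counter/" + jobId,
                clusterRoot + "/leaderlatch/" + jobId,
                clusterRoot + "/leader/" + jobId);
    }

    public static void main(String[] args) {
        // Assumed example values; a real cluster root and JobID would differ.
        jobSpecificPaths("/flink/clusterID", "job-42").forEach(System.out::println);
    }
}
```

      Each of these four nodes lives under a different parent, so a cleanup has to issue one delete per subtree.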
      

      The new layout could look like this:

      clusterID -> jobgraphs -> <job-id>
                -> jobs -> <job-id> -> checkpoints -> checkpoint-1
                                    -> checkpoint_id_counter -> counter
                                    -> leader -> latch
                                              -> connection_info
                -> leader -> dispatcher -> latch
                                        -> connection_info
                          -> resource_manager -> latch
                                              -> connection_info
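
      The proposed grouping can be sketched as follows. This is only an illustration under stated assumptions: the class, the method names and the clusterRoot parameter are hypothetical, and the in-memory TreeSet stands in for the ZooKeeper tree so that the example runs without a ZooKeeper server. Against a real cluster, a recursive delete of the job root (for example Curator's delete().deletingChildrenIfNeeded().forPath(...)) would play the role of deleteSubtree.

```java
import java.util.TreeSet;

/** Hypothetical helper for illustration only; not part of Flink's API. */
class ProposedLayoutSketch {

    /** Root zNode under which all job-specific HA data is grouped. */
    static String jobRoot(String clusterRoot, String jobId) {
        return clusterRoot + "/jobs/" + jobId;
    }

    static String checkpoints(String clusterRoot, String jobId) {
        return jobRoot(clusterRoot, jobId) + "/checkpoints";
    }

    static String checkpointIdCounter(String clusterRoot, String jobId) {
        return jobRoot(clusterRoot, jobId) + "/checkpoint_id_counter";
    }

    static String leaderLatch(String clusterRoot, String jobId) {
        return jobRoot(clusterRoot, jobId) + "/leader/latch";
    }

    static String leaderConnectionInfo(String clusterRoot, String jobId) {
        return jobRoot(clusterRoot, jobId) + "/leader/connection_info";
    }

    /** Removes root and all of its descendants from the in-memory path set; returns the number of removed nodes. */
    static int deleteSubtree(TreeSet<String> nodes, String root) {
        int before = nodes.size();
        nodes.removeIf(path -> path.equals(root) || path.startsWith(root + "/"));
        return before - nodes.size();
    }

    public static void main(String[] args) {
        String cluster = "/flink/clusterID"; // assumed example value
        TreeSet<String> nodes = new TreeSet<>();
        for (String job : new String[] {"job-a", "job-b"}) {
            nodes.add(checkpoints(cluster, job));
            nodes.add(checkpointIdCounter(cluster, job));
            nodes.add(leaderLatch(cluster, job));
            nodes.add(leaderConnectionInfo(cluster, job));
        }
        nodes.add(cluster + "/leader/dispatcher/latch");

        // A single subtree delete removes everything that belongs to job-a.
        deleteSubtree(nodes, jobRoot(cluster, "job-a"));
        nodes.forEach(System.out::println);
    }
}
```

      After the call, only the nodes of job-b and the cluster-wide dispatcher latch remain, which is the cleanup behaviour the new layout is meant to enable.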
      

            People

              trohrmann Till Rohrmann