Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Versions: 1.13.0, 1.14.0, 1.12.3
Description
To make cleaning up the ZooKeeper HA services easier, I suggest grouping job-specific services under a common jobs/<JobID> zNode. Cleaning up the job-specific ZooKeeper data then becomes trivial: simply delete the jobs/<JobID> node.
Currently, our ZooKeeper layout is not well structured. It looks like this:
clusterID
  -> jobgraphs
    -> <job-id>
  -> checkpoints
    -> <job-id>
      -> checkpoint-1
  -> checkpoint-counter
    -> <job-id>
      -> counter
  -> leaderlatch
    -> dispatcher_lock
    -> resource_manager_lock
    -> <job-id>
  -> leader
    -> dispatcher_lock
    -> resource_manager_lock
    -> <job-id>
The new layout could look like this:
clusterID
  -> jobgraphs
    -> <job-id>
  -> jobs
    -> <job-id>
      -> checkpoints
        -> checkpoint-1
      -> checkpoint_id_counter
        -> counter
      -> leader
        -> latch
        -> connection_info
  -> leader
    -> dispatcher
      -> latch
      -> connection_info
    -> resource_manager
      -> latch
      -> connection_info
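To illustrate why the grouped layout simplifies cleanup, here is a minimal sketch that models zNodes as a plain set of path strings. It is not a ZooKeeper client and not Flink's implementation. The cluster node name `/cluster-1`, the job IDs, and all function names are hypothetical, and only the HA service nodes (not jobgraphs) are considered.

```python
# In-memory model of zNode paths; a sketch, not a real ZooKeeper client.
CLUSTER = "/cluster-1"  # hypothetical clusterID node


def old_layout(job_ids):
    """Paths in the current layout: job nodes are scattered under several parents."""
    paths = {
        f"{CLUSTER}/leaderlatch/dispatcher_lock",
        f"{CLUSTER}/leaderlatch/resource_manager_lock",
        f"{CLUSTER}/leader/dispatcher_lock",
        f"{CLUSTER}/leader/resource_manager_lock",
    }
    for job in job_ids:
        paths |= {
            f"{CLUSTER}/jobgraphs/{job}",
            f"{CLUSTER}/checkpoints/{job}/checkpoint-1",
            f"{CLUSTER}/checkpoint-counter/{job}",
            f"{CLUSTER}/leaderlatch/{job}",
            f"{CLUSTER}/leader/{job}",
        }
    return paths


def new_layout(job_ids):
    """Paths in the proposed layout: HA services of a job live under jobs/<job-id>."""
    paths = {
        f"{CLUSTER}/leader/dispatcher/latch",
        f"{CLUSTER}/leader/dispatcher/connection_info",
        f"{CLUSTER}/leader/resource_manager/latch",
        f"{CLUSTER}/leader/resource_manager/connection_info",
    }
    for job in job_ids:
        paths |= {
            f"{CLUSTER}/jobgraphs/{job}",
            f"{CLUSTER}/jobs/{job}/checkpoints/checkpoint-1",
            f"{CLUSTER}/jobs/{job}/checkpoint_id_counter",
            f"{CLUSTER}/jobs/{job}/leader/latch",
            f"{CLUSTER}/jobs/{job}/leader/connection_info",
        }
    return paths


def job_roots_old(job_id):
    """Nodes that must each be deleted recursively in the current layout."""
    parents = ("checkpoints", "checkpoint-counter", "leaderlatch", "leader")
    return [f"{CLUSTER}/{parent}/{job_id}" for parent in parents]


def job_roots_new(job_id):
    """The single node to delete recursively in the proposed layout."""
    return [f"{CLUSTER}/jobs/{job_id}"]


def delete_subtree(znodes, root):
    """Return the paths that remain after removing `root` and its descendants."""
    prefix = root.rstrip("/") + "/"
    return {p for p in znodes if p != root and not p.startswith(prefix)}


def cleanup(znodes, roots):
    """Apply one recursive delete per root."""
    for root in roots:
        znodes = delete_subtree(znodes, root)
    return znodes


old_after = cleanup(old_layout(["job-a", "job-b"]), job_roots_old("job-a"))
new_after = cleanup(new_layout(["job-a", "job-b"]), job_roots_new("job-a"))
```

In this model the current layout needs four recursive deletes per job, while the proposed layout needs one. Against a real ensemble, a recursive delete could be issued with a client such as Apache Curator (`delete().deletingChildrenIfNeeded().forPath(...)`); whether Flink uses exactly that call is an assumption here.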
Issue Links
causes
- FLINK-22745 MesosWorkerStore is started with an illegal namespace (Closed)
- FLINK-22784 Jepsen tests broken due to change in zNode layout (Closed)
is related to
- FLINK-20695 Zookeeper node under leader and leaderlatch is not deleted after job finished (Closed)
- links to