The MasterDetector code still relies on old ZooKeeper abstractions, which makes it error-prone to maintain and extend.
Newer, higher-level ZooKeeper group management abstractions exist, such as zookeeper::Group. After this refactoring, Mesos master detection and contention logic should reside in separate classes that depend on general ZooKeeper group leader detection and contention abstractions with Future-pattern APIs. Masters should then use the MasterContender, while slaves and schedulers should use the MasterDetector.
The following is a summary of the change:
Two layers of ZK abstractions are added:
- zookeeper::LeaderContender is responsible for contending to be a leader. It does not detect whether it has been elected; it only registers itself in ZK and watches for membership changes.
  zookeeper::LeaderDetector is responsible for detecting the current leader. The contender and detector run independently of each other.
- mesos::internal::MasterContender and mesos::internal::MasterDetector wrap around the zookeeper contender and detector abstractions as adapters to provide/interpret the ZooKeeper data (as Master PIDs, later as MasterInfo after
- MasterContender is used only by the master, while MasterDetector is used by the master, slave, and scheduler driver. With the new abstractions, the master election logic starts after the client components (master/slave/scheduler driver) are fully initialized (e.g. after slave recovery for slaves) instead of before they start.
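The layering described above can be illustrated with a minimal sketch. Everything in it is hypothetical: ToyGroup is an in-memory stand-in rather than the real zookeeper::Group, the Toy* class names and method signatures are invented for illustration, the calls are synchronous instead of returning Futures, and the rule that the lowest sequence number leads is an assumption made for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Toy in-memory stand-in for a ZooKeeper group (hypothetical, not the real
// zookeeper::Group). Each member gets an increasing sequence number, much
// like a sequential znode.
class ToyGroup {
public:
  std::uint64_t join(const std::string& data) {
    const std::uint64_t id = next_++;
    members_[id] = data;
    return id;
  }

  void cancel(std::uint64_t id) { members_.erase(id); }

  const std::map<std::uint64_t, std::string>& members() const {
    return members_;
  }

private:
  std::uint64_t next_ = 0;
  std::map<std::uint64_t, std::string> members_;
};

// Lower layer: registers a candidate in the group. It never decides who leads.
class ToyLeaderContender {
public:
  ToyLeaderContender(ToyGroup* group, const std::string& data)
    : group_(group), data_(data), joined_(false), id_(0) {}

  void contend() {
    if (!joined_) {
      id_ = group_->join(data_);
      joined_ = true;
    }
  }

  void withdraw() {
    if (joined_) {
      group_->cancel(id_);
      joined_ = false;
    }
  }

private:
  ToyGroup* group_;
  std::string data_;
  bool joined_;
  std::uint64_t id_;
};

// Lower layer: reads the group and returns the leader's data, or an empty
// string when the group has no members. Assumes the lowest sequence number
// leads (std::map keeps keys sorted, so begin() is that member).
class ToyLeaderDetector {
public:
  explicit ToyLeaderDetector(const ToyGroup* group) : group_(group) {}

  std::string detect() const {
    const auto& members = group_->members();
    return members.empty() ? std::string() : members.begin()->second;
  }

private:
  const ToyGroup* group_;
};

// Upper layer: thin adapters that treat the stored data as a master PID.
class ToyMasterContender {
public:
  ToyMasterContender(ToyGroup* group, const std::string& pid)
    : contender_(group, pid) {}
  void contend() { contender_.contend(); }
  void withdraw() { contender_.withdraw(); }

private:
  ToyLeaderContender contender_;
};

class ToyMasterDetector {
public:
  explicit ToyMasterDetector(const ToyGroup* group) : detector_(group) {}
  std::string detect() const { return detector_.detect(); }

private:
  ToyLeaderDetector detector_;
};

// Usage: the first master to contend leads until it withdraws.
inline void toyElectionDemo() {
  ToyGroup group;
  ToyMasterContender first(&group, "master@10.0.0.1:5050");
  ToyMasterContender second(&group, "master@10.0.0.2:5050");
  ToyMasterDetector detector(&group);

  first.contend();
  second.contend();
  assert(detector.detect() == "master@10.0.0.1:5050");

  first.withdraw();
  assert(detector.detect() == "master@10.0.0.2:5050");
}
```

The point of the sketch is the separation of concerns: the contender only writes its membership, the detector only reads the group, and neither holds a reference to the other, so a slave or scheduler driver can create a detector without ever contending.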