Details
- Type: New Feature
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
We already have leadership transfer, but it only happens when the leader is removed in a reconfiguration. It would be nice to support it through a dedicated API. That would be really useful for reducing unavailability during a rolling upgrade or a leader shutdown.
Also, I think it could help with zxid rollover. Inheriting leadership during a rollover should be similar to leadership transfer at the protocol level.
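For context on the rollover case: a zxid is a 64-bit value whose upper 32 bits hold the leader epoch and whose lower 32 bits hold a per-epoch counter. The sketch below only illustrates that layout. The class and method names are hypothetical, not ZooKeeper's actual code. It shows why an exhausted counter forces a move to a new epoch, which is the point where a transfer-like handoff could replace a full election.

```java
// Illustrative sketch only: names are hypothetical, not ZooKeeper's real classes.
public final class ZxidLayout {
    // Lower 32 bits of a zxid: the per-epoch transaction counter.
    private static final long COUNTER_MASK = 0xffffffffL;

    // Pack an epoch (upper 32 bits) and a counter (lower 32 bits) into one zxid.
    static long makeZxid(long epoch, long counter) {
        return (epoch << 32) | (counter & COUNTER_MASK);
    }

    static long epochOf(long zxid) {
        return zxid >> 32;
    }

    static long counterOf(long zxid) {
        return zxid & COUNTER_MASK;
    }

    // True when the counter cannot be incremented again within the same epoch.
    static boolean counterExhausted(long zxid) {
        return counterOf(zxid) == COUNTER_MASK;
    }

    // Next zxid: bump the counter, or start a new epoch at counter 0 on rollover.
    // Assumption: in a real ensemble the new epoch must be agreed on by a quorum;
    // this method only shows the arithmetic.
    static long next(long zxid) {
        if (counterExhausted(zxid)) {
            return makeZxid(epochOf(zxid) + 1, 0);
        }
        return zxid + 1;
    }

    public static void main(String[] args) {
        long last = makeZxid(5, COUNTER_MASK);
        System.out.println("exhausted: " + counterExhausted(last));
        System.out.println("next epoch: " + epochOf(next(last)));
    }
}
```

If the current leader could keep leadership across the epoch change, the way a designated leader does in reconfiguration, the rollover would not need a full election.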
https://www.usenix.org/conference/atc12/technical-sessions/presentation/shraer
we investigate the effect of reconfigurations removing the leader. Note that a server can never be added to a cluster as leader as we always prioritize the current leader. Figure 8 shows the advantage of designating a new leader when removing the current one, and thus avoiding leader election. It depicts the average time to recover from a leader crash versus the average time to regain system availability following the removal of the leader. The average is taken on 10 executions. We can see that designating a default leader saves up to 1sec, depending on the cluster size. As cluster size increases, leader election takes longer while using a default leader takes constant time regardless of the cluster size. Nevertheless, as the figure shows, cluster size always affects total leader recovery time, as it includes synchronizing state with a quorum of followers.
Issue Links
- relates to
  - ZOOKEEPER-2789 Reassign `ZXID` for solving 32bit overflow problem (Open)
  - ZOOKEEPER-1277 servers stop serving when lower 32bits of zxid roll over (Resolved)