[ZOOKEEPER-1032] speed up recovery from leader failure - ASF JIRA

Details

Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: server
Labels: None

Description

When the number of nodes is large, it may take a long time to recover from a leader failure.
There are several points to improve:

1. A follower should take its snapshot asynchronously once it is up to date.

2. Currently the Leader/Follower clears the DataTree on leader failure and then restores it from a snapshot and the transaction logs. The DataTree should not be cleared; it should only be brought up to date from the transaction logs.

3. FileTxnLog should keep recent transaction logs in memory, so that when the DataTree is not far behind the transaction logs, the in-memory transaction logs can be used to restore the DataTree.
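
Points 2 and 3 could be combined roughly as in the sketch below. It is only an illustration of the idea, not ZooKeeper code: `RecentTxnCache`, `Txn`, and `catchUp` are hypothetical names, the DataTree is stood in for by a plain `Map`, and transactions are assumed to be appended in increasing zxid order.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;

// Hypothetical sketch: these types are illustrative and are not ZooKeeper classes.
final class RecentTxnCache {
    /** Simplified transaction: set `path` to `data` at transaction id `zxid`. */
    static final class Txn {
        final long zxid;
        final String path;
        final String data;

        Txn(long zxid, String path, String data) {
            this.zxid = zxid;
            this.path = path;
            this.data = data;
        }
    }

    private final Deque<Txn> recent = new ArrayDeque<>();
    private final int capacity;

    RecentTxnCache(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Remembers a committed transaction, evicting the oldest one when full. */
    void append(Txn txn) {
        if (recent.size() == capacity) {
            recent.removeFirst();
        }
        recent.addLast(txn);
    }

    /**
     * Replays cached transactions newer than lastAppliedZxid onto the existing
     * tree without clearing it. Returns the new last applied zxid, or -1 when
     * the cache does not reach back to lastAppliedZxid and the caller must fall
     * back to a snapshot plus the on-disk logs.
     */
    long catchUp(Map<String, String> tree, long lastAppliedZxid) {
        // Conservative coverage check: the tree's last transaction must still be cached.
        if (recent.isEmpty() || recent.peekFirst().zxid > lastAppliedZxid) {
            return -1;
        }
        long applied = lastAppliedZxid;
        for (Txn txn : recent) {
            if (txn.zxid > applied) {
                tree.put(txn.path, txn.data);
                applied = txn.zxid;
            }
        }
        return applied;
    }
}
```

A server that is only a few transactions behind would call `catchUp` first and load a snapshot only when it returns -1. The coverage check is deliberately conservative: it requires the tree's last applied transaction to still be in the cache, so it never assumes zxids are contiguous.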

People

Assignee: Unassigned

Reporter: jiangwen wei

Votes: 1

Watchers: 4

Dates

Created: 27/Mar/11 10:03

Updated: 03/Feb/22 08:50