ZooKeeper / ZOOKEEPER-1032

Speed up recovery from leader failure


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: server
    • Labels: None

    Description

      When the number of nodes is large, it may take a long time to recover from a leader failure.
      There are several points to improve:

      1. A Follower should take its snapshot asynchronously once it is up to date.

      2. Currently, the Leader/Follower clears the DataTree on leader failure and then restores it from a snapshot and the transaction logs. The DataTree should not be cleared; it should only be brought up to date from the transaction logs.

      3. FileTxnLog should keep recent transaction log entries in memory, so that when the DataTree is not far behind the transaction logs, the in-memory entries can be used to restore the DataTree.
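
      Points 2 and 3 could be sketched roughly as follows. This is only an illustration, not ZooKeeper code: `RecentTxnCache`, `Txn`, and `catchUp` are hypothetical names, the DataTree is stood in for by a plain `Map`, and every transaction is simplified to "set a path to a value". The sketch also assumes the stale tree's history is a prefix of the log, so it does not handle the case where the tree holds transactions that must be rolled back.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

/** Hypothetical sketch: none of these types are real ZooKeeper classes. */
class RecentTxnCache {

    /** Simplified transaction: set `path` to `value` at transaction id `zxid`. */
    static final class Txn {
        final long zxid;
        final String path;
        final String value;

        Txn(long zxid, String path, String value) {
            this.zxid = zxid;
            this.path = path;
            this.value = value;
        }
    }

    private final int capacity;
    private final Deque<Txn> recent = new ArrayDeque<>();

    RecentTxnCache(int capacity) {
        if (capacity <= 0) throw new IllegalArgumentException("capacity must be positive");
        this.capacity = capacity;
    }

    /** Records a committed transaction, evicting the oldest one when full. */
    synchronized void append(Txn txn) {
        if (recent.size() == capacity) {
            recent.removeFirst();
        }
        recent.addLast(txn);
    }

    /**
     * Returns the cached transactions newer than lastApplied, or empty when the
     * cache cannot prove it holds all of them (the caller must then fall back
     * to a snapshot plus the on-disk logs).
     */
    synchronized Optional<List<Txn>> txnsAfter(long lastApplied) {
        // Coverage holds only if the oldest cached txn is not newer than
        // lastApplied; this avoids assuming that zxids are consecutive.
        if (recent.isEmpty() || recent.peekFirst().zxid > lastApplied) {
            return Optional.empty();
        }
        List<Txn> missing = new ArrayList<>();
        for (Txn t : recent) {
            if (t.zxid > lastApplied) missing.add(t);
        }
        return Optional.of(missing);
    }

    /**
     * Brings `tree` up to date in place, without clearing it. Returns the new
     * last-applied zxid, or -1 if a full restore is required.
     */
    static long catchUp(Map<String, String> tree, long lastApplied, RecentTxnCache cache) {
        Optional<List<Txn>> missing = cache.txnsAfter(lastApplied);
        if (!missing.isPresent()) return -1;
        long applied = lastApplied;
        for (Txn t : missing.get()) {
            tree.put(t.path, t.value);
            applied = t.zxid;
        }
        return applied;
    }

    public static void main(String[] args) {
        RecentTxnCache cache = new RecentTxnCache(3);
        for (long z = 1; z <= 5; z++) {
            cache.append(new Txn(z, "/n" + z, "v" + z)); // cache ends up holding zxids 3..5
        }
        Map<String, String> tree = new TreeMap<>();
        for (long z = 1; z <= 3; z++) tree.put("/n" + z, "v" + z); // tree is at zxid 3
        System.out.println(catchUp(tree, 3, cache));            // 5: replayed from memory
        System.out.println(catchUp(new TreeMap<>(), 1, cache)); // -1: too far behind
    }
}
```

      The bounded deque keeps memory use fixed, and the `-1` return is the signal to fall back to the existing snapshot-plus-log restore path, so the in-memory replay is purely an optimization for replicas that are only slightly behind.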


          People

            Assignee: Unassigned
            Reporter: jiangwen wei (wjiangwen)
            Votes: 1
            Watchers: 4
