[ZOOKEEPER-1413] Use on-disk transaction log for learner sync up - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Minor
Resolution: Fixed
Affects Version/s: 3.4.3
Fix Version/s: 3.5.0
Component/s: server
Labels:
- performance
- quorum

Description

Motivation:
A learner syncs with the leader by retrieving committed log entries from it. Currently, the leader keeps only the 500 most recently committed log entries in memory. If the learner falls more than 500 updates behind, the leader sends it the entire snapshot.

For some of our ZooKeeper deployments the snapshot is about 10 GB, so sending the entire snapshot over the network is prohibitively expensive. Additionally, our ZooKeeper ensembles may serve more than 4K updates per second. At that rate 500 entries cover only about an eighth of a second of traffic, so a network hiccup lasting less than a second forces the learner into a snapshot transfer.

Design:
Instead of looking only at the committed log in memory, the leader will also look at the transaction log on disk. The amount of transaction log kept on disk is configurable; the current default is 100k. This allows ZooKeeper to tolerate longer temporary network failures before initiating a snapshot transfer.
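
To make the fallback order concrete, here is a minimal sketch of how a leader might pick a sync source for a lagging learner. It is not the code from the patch: the class `SyncPlanner`, the enum `SyncMode`, and both method names are hypothetical. It also assumes that zxids behave like a plain counter (epoch handling and the case of a learner that is ahead of the leader are ignored) and that the 100k default counts transactions.

```java
/** Hypothetical sketch; names are illustrative and not taken from ZooKeeper's code. */
class SyncPlanner {
    enum SyncMode { DIFF_FROM_MEMORY, DIFF_FROM_TXNLOG, SNAPSHOT }

    /**
     * Picks the cheapest source that still holds every transaction after learnerZxid.
     * Assumption: zxids are treated as a simple increasing counter.
     */
    static SyncMode choose(long learnerZxid, long oldestInMemory, long oldestOnDisk) {
        long firstNeeded = learnerZxid + 1;
        if (firstNeeded >= oldestInMemory) {
            return SyncMode.DIFF_FROM_MEMORY;   // in-memory committed log covers the gap
        }
        if (firstNeeded >= oldestOnDisk) {
            return SyncMode.DIFF_FROM_TXNLOG;   // on-disk transaction log covers the gap
        }
        return SyncMode.SNAPSHOT;               // gap too large, fall back to full snapshot
    }

    /** Seconds of disconnection that a window of `entries` transactions can absorb. */
    static double secondsCovered(long entries, double updatesPerSecond) {
        return entries / updatesPerSecond;
    }
}
```

With the figures from the description, `secondsCovered(500, 4000)` is 0.125 and `secondsCovered(100_000, 4000)` is 25.0, so under these assumptions the on-disk window stretches the tolerated outage from a fraction of a second to roughly half a minute.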

Implementation:
We plan to add an interface to the persistence layer that can be used to retrieve proposals from the on-disk transaction log. These proposals can then be sent to the learner using the existing protocol.
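
As a rough illustration of what such a persistence-layer interface could look like, the sketch below exposes the log as an iterator starting at a given zxid. Everything here is an assumption made for illustration: `TxnLogReader`, `LoggedTxn`, `InMemoryTxnLog`, and `replayTo` are hypothetical names, the log is simulated with an in-memory list rather than read from disk, and zxids are treated as a plain counter.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch of reading proposals back from a transaction log. */
class TxnLogReplaySketch {
    /** Stand-in for a logged transaction; not ZooKeeper's own proposal type. */
    static final class LoggedTxn {
        final long zxid;
        final String payload;
        LoggedTxn(long zxid, String payload) {
            this.zxid = zxid;
            this.payload = payload;
        }
    }

    /** Hypothetical persistence-layer interface: iterate transactions from startZxid onward. */
    interface TxnLogReader {
        Iterator<LoggedTxn> readFrom(long startZxid);
    }

    /** In-memory stand-in for the on-disk log, kept in append (zxid) order. */
    static final class InMemoryTxnLog implements TxnLogReader {
        private final List<LoggedTxn> entries = new ArrayList<>();

        void append(LoggedTxn txn) {
            entries.add(txn);
        }

        @Override
        public Iterator<LoggedTxn> readFrom(long startZxid) {
            List<LoggedTxn> out = new ArrayList<>();
            for (LoggedTxn txn : entries) {
                if (txn.zxid >= startZxid) {
                    out.add(txn);
                }
            }
            return out.iterator();
        }
    }

    /** Collects the zxids a leader would send to a learner whose last zxid is learnerZxid. */
    static List<Long> replayTo(TxnLogReader log, long learnerZxid) {
        List<Long> sent = new ArrayList<>();
        Iterator<LoggedTxn> it = log.readFrom(learnerZxid + 1);
        while (it.hasNext()) {
            sent.add(it.next().zxid);
        }
        return sent;
    }
}
```

An iterator keeps the caller from loading the whole log into memory at once, which matters when the on-disk window is much larger than the in-memory one. A real implementation would also need to handle a missing or corrupted log, which this sketch does not attempt.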

Attachments

- ZOOKEEPER-1413-3.4.patch, 14/Oct/13 06:47, 88 kB, Germán Blanco
- ZOOKEEPER-1413-3.4.patch, 14/Oct/13 06:28, 61 kB, Germán Blanco
- ZOOKEEPER-1413.patch, 28/Jun/13 03:27, 88 kB, Thawan Kooburat
- ZOOKEEPER-1413.patch, 24/Jun/13 06:56, 87 kB, Thawan Kooburat
- ZOOKEEPER-1413.patch, 21/Jun/13 21:22, 87 kB, Thawan Kooburat
- ZOOKEEPER-1413.patch, 21/Jun/13 04:02, 87 kB, Thawan Kooburat
- ZOOKEEPER-1413.patch, 18/May/13 03:03, 87 kB, Thawan Kooburat
- ZOOKEEPER-1413.patch, 18/May/13 01:28, 70 kB, Thawan Kooburat
- ZOOKEEPER-1413.patch, 21/Mar/12 17:29, 23 kB, Thawan Kooburat

Issue Links

contains

ZOOKEEPER-876 Unnecessary snapshot transfers between new leader and followers

Resolved

is depended upon by

ZOOKEEPER-1709 Limit the size of txnlog file

Open

ZOOKEEPER-1710 Leader should not use txnlog for synchronization if txnlog is corrupted or missing

Open

is required by

ZOOKEEPER-1777 Missing ephemeral nodes in one of the members of the ensemble

Open

People

Assignee: Thawan Kooburat

Reporter: Thawan Kooburat

Votes: 0

Watchers: 12

Dates

Created: 13/Mar/12 21:42

Updated: 14/Oct/13 15:55

Resolved: 01/Jul/13 17:22