[ZOOKEEPER-3114] Built-in data consistency check inside ZooKeeper - ASF JIRA

XML

Word

Printable

JSON

Details

Type: New Feature
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 3.6.0
Component/s: quorum
Labels:
None

Description

The correctness of ZooKeeper was kind of proved in theory in ZAB paper, but the implementation is a bit different from the paper, for example, save the currentEpoch and proposals/commits upon to NEWLEADER is not atomic in current implementation, so the correctness of ZooKeeper is not actually proved in reality.

Also bugs could be introduced during implementation, issues like sending NEWLEADER packet too early reported in ~~ZOOKEEPER-3104~~ might be there since the beginning (didn't check exactly when this was introduced).

More correctness issues were introduced when adding new features, like on disk txn sync, local session, retain database, etc, both of these features added inconsistency bugs on production.

To catch the consistency issue earlier, internally, we're running external consistency checkers to compare nodes (digest), but that's not efficient (slow and expensive) and there are corner cases we cannot cover in external checker. For example, we don't know the last zxid before epoch change, which makes it's impossible to check it's missing txn or not. Another challenge is the false negative which is hard to avoid due to fuzzy snapshot or expected txn gap during snapshot syncing, etc.

This Jira is going to propose a built-in real time consistency check by calculating the digest of DataTree after applying each txn, and sending it over to learner during propose time so that it can verify the correctness in real time.

The consistency check will cover all phases, including loading time during startup, syncing, and broadcasting. It can help us avoid data lost or data corrupt due to bad disk and catch bugs in code.

The protocol change will make backward compatible to make sure we can enable/disable this feature transparently.

As for performance impact, based on our testing, it will add a bit overhead during runtime, but doesn't have obvious impact in general.

Attachments

Sub-Tasks

Data integrity check when loading snapshot/txns from disk

Resolved

Fangmin Lv

100%

Real time data integrity check during broadcast time

Resolved

Fangmin Lv

100%

Activity

People

Assignee:: Fangmin Lv

Reporter:: Fangmin Lv

Votes:: 2 Vote for this issue

Watchers:: 6 Start watching this issue

Dates

Created:: 08/Aug/18 01:02

Updated:: 27/Dec/19 21:36

Resolved:: 27/Dec/19 21:36

Time Tracking

Estimated:

Not Specified

Remaining:

Logged:

18.5h

Include sub-tasks