This is the first step toward enabling an on-disk storage engine for ZooKeeper: it extends the existing Snap interface and implements a RocksDB-backed snapshot. Compared to the file-based snapshot, the RocksDB-based snapshot is better suited to large in-memory data trees because it supports incremental snapshots, serializing only the data that changed between snapshots.
High-level overview:
- Extend the Snap interface so that everything that needs to be serialized is represented on the interface.
- Implement a RocksDB-based snapshot, plus bidirectional conversion between the file-based snapshot and the RocksDB snapshot, for backward / forward compatibility.
- Change data capture is implemented by buffering the transactions applied to the data tree and applying them to RocksDB as each transaction is processed. An incremental snapshot therefore requires only a RocksDB flush. ZK always takes a full snapshot when it first loads the data tree during startup.
- This feature is disabled by default. Users must opt in by explicitly specifying a Java system property so that RocksDBSnap is instantiated at runtime.
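
To illustrate the change data capture idea, here is a minimal, self-contained sketch. It is not the code in this patch: `SnapStore`, `BufferedStore`, `fullSnapshot`, and `applyTxn` are hypothetical names, a plain `TreeMap` stands in for RocksDB, and the data tree is reduced to a map from path to bytes. It only shows why an incremental snapshot can cost as much as the number of changes rather than the size of the tree.

```java
import java.util.Map;
import java.util.TreeMap;

public class IncrementalSnapSketch {

    /** Hypothetical snapshot backend; NOT the actual ZooKeeper Snap interface. */
    interface SnapStore {
        void put(String path, byte[] data);
        void delete(String path);
        /** Makes pending changes durable and returns how many were written. */
        int flush();
    }

    /** In-memory stand-in for RocksDB: writes stay pending until flush(). */
    static class BufferedStore implements SnapStore {
        private final Map<String, byte[]> durable = new TreeMap<>();
        // A null value marks a deletion (tombstone).
        private final Map<String, byte[]> pending = new TreeMap<>();

        @Override
        public void put(String path, byte[] data) {
            pending.put(path, data);
        }

        @Override
        public void delete(String path) {
            pending.put(path, null);
        }

        @Override
        public int flush() {
            int written = pending.size();
            for (Map.Entry<String, byte[]> e : pending.entrySet()) {
                if (e.getValue() == null) {
                    durable.remove(e.getKey());
                } else {
                    durable.put(e.getKey(), e.getValue());
                }
            }
            pending.clear();
            return written;
        }

        int durableSize() {
            return durable.size();
        }
    }

    /** Full snapshot: every node in the tree is written, then flushed. */
    static int fullSnapshot(Map<String, byte[]> dataTree, SnapStore store) {
        for (Map.Entry<String, byte[]> e : dataTree.entrySet()) {
            store.put(e.getKey(), e.getValue());
        }
        return store.flush();
    }

    /** Applies one transaction to both the tree and the store; null data means delete. */
    static void applyTxn(Map<String, byte[]> dataTree, SnapStore store, String path, byte[] data) {
        if (data == null) {
            dataTree.remove(path);
            store.delete(path);
        } else {
            dataTree.put(path, data);
            store.put(path, data);
        }
    }

    public static void main(String[] args) {
        Map<String, byte[]> tree = new TreeMap<>();
        tree.put("/a", new byte[] {1});
        tree.put("/b", new byte[] {2});
        tree.put("/c", new byte[] {3});

        BufferedStore store = new BufferedStore();
        System.out.println("full snapshot wrote " + fullSnapshot(tree, store));

        applyTxn(tree, store, "/a", new byte[] {9}); // update
        applyTxn(tree, store, "/b", null);           // delete
        System.out.println("incremental snapshot wrote " + store.flush());
        System.out.println("durable nodes: " + store.durableSize());
    }
}
```

In this sketch the full snapshot writes all three nodes, while the later flush writes only the two changed entries, which mirrors the full-on-startup, incremental-afterwards behavior described above.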
This work is based on the patch attached to ZOOKEEPER-3783 (kudos to Fangmin and co at FB), with some bug / test fixes and adjustments so that it applies cleanly to the master branch.