Type: New Feature
Affects Version/s: None
Fix Version/s: None
HBase should consider supporting federated deployments, where someone runs terascale (or larger) clusters in more than one geography and wants the system to handle replication between those clusters. It would be sweet if HBase had something on the roadmap to sync between replicas out of the box.
Consider whether rows, columns, or even individual cells could be scoped as either local or global.
Then, consider a background task on each cluster that replicates new globally scoped edits to peer clusters. The HBase/Bigtable data model has convenient features (timestamps, multiversioning) such that a simple exchange of globally scoped cells would be conflict free and would "just work". The implementation effort here would go into an efficient mechanism for collecting edits from all the region servers (HRS) and transmitting them over the network to peers, where they would then be split out to the HRS there. It would also be necessary to hold on to the edit trace and track it until the remote commits succeed. So HLog is probably the right place to set up the tee. This would be filtered log shipping, basically.
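To make the idea concrete, here is a minimal sketch of filtered log shipping between two clusters. It is only an illustration under stated assumptions, not HBase code: the names `Edit`, `Cluster`, `outbox`, `ship`, and `receive` are hypothetical, each cluster has a single peer, the network is a direct method call, and no two clusters write the same row, column, and timestamp with different values.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

LOCAL, GLOBAL = "local", "global"  # hypothetical scope labels


@dataclass(frozen=True)
class Edit:
    row: str
    column: str
    timestamp: int
    value: bytes
    scope: str = LOCAL


class Cluster:
    """Toy stand-in for one cluster: a versioned store plus a log tee."""

    def __init__(self, name: str) -> None:
        self.name = name
        # (row, column, timestamp) -> value; every version is kept.
        self.store: Dict[Tuple[str, str, int], bytes] = {}
        # Globally scoped edits not yet acknowledged by the peer.
        self.outbox: List[Edit] = []

    def _apply(self, edit: Edit) -> None:
        self.store[(edit.row, edit.column, edit.timestamp)] = edit.value

    def put(self, edit: Edit) -> None:
        """Client write: apply locally, then tee global edits to the outbox."""
        self._apply(edit)
        if edit.scope == GLOBAL:
            self.outbox.append(edit)

    def receive(self, edits: List[Edit]) -> int:
        """Apply edits shipped by a peer; they are not re-shipped.

        Returns the number of edits committed, which serves as the ack.
        """
        for edit in edits:
            self._apply(edit)
        return len(edits)

    def ship(self, peer: "Cluster") -> int:
        """Send pending global edits; drop only those the peer acked."""
        acked = peer.receive(list(self.outbox))
        del self.outbox[:acked]
        return acked

    def get(self, row: str, column: str) -> Optional[bytes]:
        """Return the value with the newest timestamp, if any."""
        versions = [
            (ts, value)
            for (r, c, ts), value in self.store.items()
            if r == row and c == column
        ]
        return max(versions)[1] if versions else None


east, west = Cluster("east"), Cluster("west")
east.put(Edit("r1", "cf:a", 100, b"east-v", GLOBAL))
west.put(Edit("r1", "cf:a", 200, b"west-v", GLOBAL))
east.put(Edit("r2", "cf:a", 150, b"private", LOCAL))
east.ship(west)
west.ship(east)
```

After both shipments, each cluster holds both versions of `r1` and reads resolve to the newest timestamp on either side, while the locally scoped `r2` never leaves `east`. Because applying an edit is keyed on row, column, and timestamp, re-shipping an unacknowledged batch is harmless in this sketch.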
This proposal does not consider transactional tables. For transactional tables, enforcement of global mutation commit ordering would come into the picture if the user wants a transaction to span the federation. Because enforcing that ordering across clusters would be slow, this should be an optional feature, even though transactional tables are themselves optional.
| 1. | Pluggable replication framework | Open | Unassigned |
| 2. | Record log region splits and region moves in the HLog | Open | |