diff --git src/main/docbkx/ops_mgt.xml src/main/docbkx/ops_mgt.xml index c12f8d4..e531de0 100644 --- src/main/docbkx/ops_mgt.xml +++ src/main/docbkx/ops_mgt.xml @@ -869,6 +869,138 @@ false Cluster Replication See Cluster Replication. + +
Status + + This package is experimental-quality software and is only meant to be a base + for future developments. The current implementation offers the following + features: + + + Master/Slave replication. + Master/Master replication. + Cyclic replication. + Replication of scoped families in user tables. + Start/stop replication stream. + Supports clusters of different sizes. + Handling of partitions longer than 10 minutes. + Ability to add/remove slave clusters at runtime. + MapReduce job to compare tables on two clusters. + + + Please report any bugs you find on the project's Jira. + +
+ +
+ Requirements + + Before trying out replication, make sure to review the following requirements: + + ZooKeeper must be managed by you, not by HBase, and must + always be available during the deployment. + All machines in both clusters must be able to reach every + other machine, since replication goes from any region server on the master cluster to any + region server on the slave cluster. This also applies to the + ZooKeeper clusters. + Both clusters should have the same HBase and Hadoop major revision. + For example, having 0.90.1 on the master and 0.90.0 on the slave is + fine, but 0.90.1 and 0.89.20100725 is not. + Every table that contains families scoped for replication + must exist on every cluster with exactly the same name; the same goes for the + replicated families. + For multiple slaves, Master/Master, or cyclic replication, version + 0.92 or greater is needed. + +
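The version requirement above can be checked mechanically. The following is a minimal sketch, assuming that "major revision" means the first two dot-separated components of the version string (which matches the 0.90.1 / 0.90.0 / 0.89.20100725 example); `major_of` and `same_major` are hypothetical helper names, not part of HBase.

```shell
#!/bin/sh
# Assumption: the "major revision" is the first two dot-separated
# components of the version string (e.g. "0.90" for "0.90.1").

# major_of VERSION -> prints the first two components of VERSION
major_of() {
  printf '%s\n' "$1" | cut -d. -f1,2
}

# same_major A B -> succeeds (exit 0) when A and B share a major revision
same_major() {
  [ "$(major_of "$1")" = "$(major_of "$2")" ]
}

# Usage with the versions from the example above
if same_major "0.90.1" "0.90.0"; then
  echo "0.90.1 and 0.90.0: compatible"
fi
if ! same_major "0.90.1" "0.89.20100725"; then
  echo "0.90.1 and 0.89.20100725: not compatible"
fi
```

The same comparison would have to be repeated for the Hadoop versions of the two clusters.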
+ +
+ Deployment + + The following steps describe how to enable replication from one cluster + to another. + + + Edit ${HBASE_HOME}/conf/hbase-site.xml + on both clusters to add the following configuration: + + <property> + <name>hbase.replication</name> + <value>true</value> + </property> + + Deploy the files, and then restart HBase if it was running. + + + While HBase is running, run the following command in the master cluster's shell: + add_peer + This shows the help for setting up the replication stream between + the two clusters. If both clusters use the same ZooKeeper cluster, you have + to use a different + zookeeper.znode.parent for each, since they can't + write in the same folder. + + + + Once you have a peer, you need to enable replication on your column families. + One way to do it is to alter the table and set the scope like this: + + + disable 'your_table' + alter 'your_table', {NAME => 'family_name', REPLICATION_SCOPE => '1'} + enable 'your_table' + + + Currently, a scope of 0 (the default) means that the family won't be replicated and a + scope of 1 means that it will be. In the future, different scopes may be + used for routing policies. + + + + To list all configured peers, run the following command in the master's shell: + + list_peers (as of version 0.92) + + + + + You can confirm that your setup works by looking at any region server's log + on the master cluster and looking for the following lines: + + + Considering 1 rs, with ratio 0.1 + Getting 1 rs from peer cluster # 0 + Choosing peer 10.10.1.49:62020 + + + In this case, it indicates that 1 region server from the slave cluster + was chosen for replication.
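The log check described above can be scripted. The following is a minimal sketch, assuming the log lines have exactly the "Choosing peer host:port" shape shown in the sample; `chosen_peers` is a hypothetical helper name and the log path in the comment is a placeholder.

```shell
#!/bin/sh
# chosen_peers: read region server log text on stdin and print the
# address from every "Choosing peer <host:port>" line, one per line.
# Lines that do not contain "Choosing peer" are ignored.
chosen_peers() {
  sed -n 's/.*Choosing peer \([^ ]*\).*/\1/p'
}

# Example: list the distinct slave region servers chosen so far.
# The log path is a placeholder; adjust it to your installation.
# chosen_peers < /path/to/regionserver.log | sort -u
```

If the function prints nothing for a master-cluster region server log, no peer has been chosen yet, which may mean the peer was not added or replication is not enabled.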
+ +
+ Verifying Replicated Data + + Verifying the replicated data on two clusters is easy to do in the shell when + looking only at a few rows, but a systematic comparison requires more + computing power. This is why the VerifyReplication MR job was created. It has + to be run on the master cluster and needs to be provided with a peer id (the + one provided when establishing a replication stream) and a table name. Other + options let you specify a time range and specific families. This job's short + name is "verifyrep", which needs to be provided when pointing "hadoop jar" to the + hbase jar. + + + Alternatively, run it through $HBASE_HOME/bin/hbase: + + + $HBASE_HOME/bin/hbase org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication [options] [peerid] [tablename] + + For more information about it, run: + + $HBASE_HOME/bin/hbase org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication --help + +
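For the "hadoop jar" form mentioned at the start of this section, the command line might look like the output of the sketch below. It only prints the command (a dry run) and does not start a job. The jar path, peer id, and table name are placeholders, `verifyrep_cmd` is a hypothetical helper name, and depending on the installation HBase's dependencies may also need to be on Hadoop's classpath.

```shell
#!/bin/sh
# verifyrep_cmd JAR PEER_ID TABLE -> prints the "hadoop jar" command line
# for the verifyrep job without running it (a dry run).
verifyrep_cmd() {
  printf 'hadoop jar %s verifyrep %s %s\n' "$1" "$2" "$3"
}

# Placeholder values: jar path, peer id and table name are illustrative.
verifyrep_cmd /path/to/hbase.jar 1 your_table
```

Any options, such as a time range or a list of families, would go between "verifyrep" and the peer id; check the --help output for the exact option names in your version.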
HBase Backup