diff --git src/main/docbkx/book.xml src/main/docbkx/book.xml
index 19dd770..0c2b2c1 100644
--- src/main/docbkx/book.xml
+++ src/main/docbkx/book.xml
@@ -3081,6 +3081,88 @@ myHtd.setValue(HTableDescriptor.SPLIT_POLICY, MyCustomSplitPolicy.class.getName(
+
+ Manual Region Splitting
+ It is possible to manually split your table into regions, either at table creation
+ (pre-splitting), or at a later time by altering the table. You might choose to split your
+ table manually for one or more of the following reasons. There may be other valid reasons, but the
+ need to manually split your table might also point to problems with your schema
+ design.
+
+ Reasons to Manually Split Your Table
+
+ Your rowkeys are based on a timeseries or another monotonically increasing scheme
+ that places new data at the end of the table. The Region Server holding the last region is
+ then always under load, while the other Region Servers are idle or mostly idle.
+
+
+ You have developed an unexpected hotspot in one region of your table. For
+ instance, an application which tracks web searches might be inundated by a lot of
+ searches for a celebrity in the event of news about that celebrity.
+
+
+ You have added a large number of Region Servers to your cluster and want to
+ spread the load across them quickly.
+
+
+ You are about to run a bulk load which is likely to cause unusual and uneven load
+ across regions.
+
+
+
+ Determining Split Points
+ The goal of splitting your table manually is to improve the chances of balancing the
+ load across the cluster in situations where good rowkey design alone is not enough.
+ With that in mind, the way you split your regions depends heavily on the
+ characteristics of your data. You may already know the best way to split your
+ table. If not, the right split points depend on the format of your rowkeys.
+
+
+ Alphanumeric Rowkeys
+
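+ If your rowkeys start with a letter or number, you can split your table at
+ letter or number boundaries. For instance, the following command creates a table
+ with five split points, one at each lowercase vowel. Five split points produce six
+ regions: the first region has all rowkeys that sort before a, the second region has
+ a-d, the third region has e-h, the fourth region has i-n, the fifth region has o-t,
+ and the sixth region has u and everything that sorts after it. Rowkeys are compared
+ as bytes, so rowkeys that start with a digit or an uppercase letter sort before a and
+ fall into the first region.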
+ hbase> create 'test_table', 'f1', SPLITS=> ['a', 'e', 'i', 'o', 'u']
+ The following command splits an existing table at split point '2'.
+ hbase> split 'test_table', '2'
+ You can also split a specific region by referring to its ID. You can find the
+ region ID by looking at either the table or region in the Web UI. It will be a
+ long string such as
+ t2,1,1410227759524.829850c6eaba1acc689480acd8f081bd. (the trailing period is
+ part of the ID). The format is table_name,start_key,region_id. To split that
+ region into two, as close to equally as possible (at the nearest row boundary),
+ issue the following command.
+ hbase> split 't2,1,1410227759524.829850c6eaba1acc689480acd8f081bd.'
+ The split key is optional. If it is omitted, the table or region is split in
+ half.
+ The following example shows how to use the RegionSplitter to create a table named
+ test_table with column family f1 and 10 regions, split at hexadecimal values.
+ hbase org.apache.hadoop.hbase.util.RegionSplitter test_table HexStringSplit -c 10 -f f1
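The relationship between split points and regions described above can be sketched in a few lines of Java. This is an illustration only: `SplitPointDemo` and `regionIndex` are hypothetical names, not part of HBase, and the sketch assumes ASCII rowkeys so that Java string comparison matches byte-wise ordering.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of HBase: shows how split points map to regions.
final class SplitPointDemo {

    // Returns the 0-based index of the region that would hold rowKey, given
    // split points sorted in ascending order. Each split point is the inclusive
    // start key of the region that follows it, so N split points give N + 1 regions.
    static int regionIndex(List<String> sortedSplits, String rowKey) {
        int index = 0;
        for (String split : sortedSplits) {
            // Assumption: ASCII keys, so compareTo agrees with byte ordering.
            if (rowKey.compareTo(split) >= 0) {
                index++;
            } else {
                break;
            }
        }
        return index;
    }

    public static void main(String[] args) {
        // Same split points as the create command for test_table.
        List<String> splits = Arrays.asList("a", "e", "i", "o", "u");
        System.out.println("regions: " + (splits.size() + 1));
        for (String key : new String[] {"Zed", "apple", "echo", "kiwi", "pear", "zebra"}) {
            System.out.println(key + " -> region " + regionIndex(splits, key));
        }
    }
}
```

Note that a key such as `Zed` lands in the first region, because uppercase letters sort before `a`.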
+
+
+
+ Using a Custom Algorithm
+
+ The RegionSplitter tool is provided with HBase, and uses a SplitAlgorithm to determine split points for you. As
+ parameters, you give it the algorithm, desired number of regions, and column
+ families. It includes two split algorithms. The first is the HexStringSplit algorithm, which assumes the rowkeys are
+ hexadecimal strings. The second, UniformSplit, assumes the rowkeys are random byte arrays. If your rowkeys
+ fit neither assumption, you will probably need to develop your own SplitAlgorithm, using the provided ones as
+ models.
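As a starting point for thinking about a custom algorithm, the following standalone sketch computes evenly spaced hexadecimal split points. It is similar in spirit to splitting a hexadecimal key space evenly, but it is not the HexStringSplit implementation and does not implement the SplitAlgorithm interface. The class name `HexSplitSketch`, the method `splitPoints`, and the 8-digit key space (00000000 through ffffffff) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not part of HBase: evenly spaced hex split points.
final class HexSplitSketch {

    // Returns numRegions - 1 split points that divide the assumed key space
    // 00000000..ffffffff into numRegions roughly equal ranges. Each split
    // point is formatted as an 8-digit lowercase hexadecimal string.
    static List<String> splitPoints(int numRegions) {
        if (numRegions < 2) {
            throw new IllegalArgumentException("numRegions must be at least 2");
        }
        long range = 1L << 32; // number of distinct 8-digit hex keys
        List<String> points = new ArrayList<>();
        for (int i = 1; i < numRegions; i++) {
            long point = range * i / numRegions; // integer division rounds down
            points.add(String.format("%08x", point));
        }
        return points;
    }

    public static void main(String[] args) {
        // Four regions need three split points.
        System.out.println(splitPoints(4)); // [40000000, 80000000, c0000000]
    }
}
```

A real custom algorithm would replace the even spacing with logic that reflects how your own rowkeys are distributed.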
+
+
+
+
+ Online Region Merges