Index: src/docbkx/book.xml
===================================================================
--- src/docbkx/book.xml (revision 1214377)
+++ src/docbkx/book.xml (working copy)
@@ -1554,6 +1554,8 @@
Periodically, and when there are not any regions in transition,
a load balancer will run and move regions around to balance cluster load.
See for configuring this property.
+ See for more information on region assignment.
CatalogJanitor
Periodically checks and cleans up the .META. table. See for more information on META.
@@ -1714,6 +1716,90 @@
+
+ Region-RegionServer Assignment
+ This section describes how Regions are assigned to RegionServers.
+
+
+
+ Startup
+ When HBase starts, regions are assigned as follows (short version):
+
+
+
+ The Master invokes the AssignmentManager upon startup.
+
+
+ The AssignmentManager looks at the existing region assignments
+ in META.
+
+
+ If the region assignment is still valid (i.e., if the RegionServer is still online),
+ then the assignment is kept.
+
+
+
+ If the assignment is invalid, then the LoadBalancerFactory is invoked to assign the
+ region. The DefaultLoadBalancer will randomly assign the region to a RegionServer.
+
+
+
+
+
+
+
+ Failover
+ When a RegionServer fails (short version):
+
+
+
+ The regions immediately become unavailable because the RegionServer is down.
+
+
+ The Master will detect that the RegionServer has failed.
+
+
+ The region assignments will be considered invalid, and the regions will be re-assigned just
+ as in the startup sequence.
+
+
+
+
+
+
+
+ Region Load Balancing
+
+ Regions can be periodically moved by the .
+
+
+
+
+
+
+ Region-RegionServer Locality
+ Over time, Region-RegionServer locality is achieved via HDFS block replication.
+ When choosing where to write its replicas, the HDFS client
+ by default does as follows:
+
+ First replica is written to local node
+
+ Second replica to another node in same rack
+
+ Third replica to a node in another rack (if sufficient nodes)
+
+
+ HBase eventually achieves locality for a region after a flush or a compaction.
+ In a RegionServer failover situation, a RegionServer may be assigned regions with non-local
+ StoreFiles (i.e., none of the replicas are local). However, as new data is written
+ in the region, or as the table is compacted and StoreFiles are re-written, they will eventually become "local"
+ to the RegionServer.
+
+ For more information, see HDFS Design on Replica Placement
+ and also Lars George's blog on HBase and HDFS locality.
+
+
+
Region Splits
@@ -1725,15 +1811,6 @@
splits (and for why you might do this)
-
- Region Load Balancing
-
-
- Regions can be periodically moved by the .
-
-
-
-
Store
A Store hosts a MemStore and 0 or more StoreFiles (HFiles). A Store corresponds to a column family for a table for a given region.
@@ -2729,13 +2806,15 @@
Getting The Most From Your HBase Install by Ryan Rawson, Jonathan Gray (Hadoop World 2009).
- Papers
+ HBase Papers
BigTable by Google (2006).
+ HBase and HDFS Locality by Lars George (2010).
+ No Relation: The Mixed Blessings of Non-Relational Databases by Ian Varley (2009).
- Sites
+ HBase Sites
Cloudera's HBase Blog has a lot of links to useful HBase information.
CAP Confusion is a relevant entry for background information on
@@ -2746,10 +2825,15 @@
HBase Wiki has a page with a number of presentations.
- Books
+ HBase Books
HBase: The Definitive Guide by Lars George.
+ Hadoop Books
+ Hadoop: The Definitive Guide by Tom White.
+
+
+
HBase and the Apache Software Foundation