Index: src/docbkx/book.xml =================================================================== --- src/docbkx/book.xml (revision 1214377) +++ src/docbkx/book.xml (working copy) @@ -1554,6 +1554,8 @@ Periodically, and when there are not any regions in transition, a load balancer will run and move regions around to balance cluster load. See for configuring this property. + See for more information on region assignment. +
CatalogJanitor Periodically checks and cleans up the .META. table. See for more information on META. @@ -1714,6 +1716,90 @@
+
+ Region-RegionServer Assignment + This section describes how Regions are assigned to RegionServers. + + +
+ Startup + When HBase starts, regions are assigned as follows (short version): + + + + The Master invokes the AssignmentManager upon startup. + + + The AssignmentManager looks at the existing region assignments + in META. + + + If the region assignment is still valid (i.e., if the RegionServer is still online), + then the assignment is kept. + + + + If the assignment is invalid, then the LoadBalancerFactory is invoked to assign the + region. The DefaultLoadBalancer will randomly assign the region to a RegionServer. + + + + +
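The startup steps above can be sketched as a small, self-contained model. This is only an illustration of the described logic, not HBase code: the function name `assign_on_startup`, the dictionary standing in for META, and the server names are all hypothetical.

```python
import random


def assign_on_startup(meta_assignments, online_servers, rng=None):
    """Return a region -> RegionServer plan (hypothetical model, not the HBase API).

    meta_assignments: dict of region name -> server name as recorded in META
                      (None if the region has no recorded server).
    online_servers:   set of server names that are currently online.
    """
    rng = rng or random.Random()
    candidates = sorted(online_servers)
    if not candidates:
        raise ValueError("no online RegionServers to assign regions to")

    plan = {}
    for region, server in meta_assignments.items():
        if server in online_servers:
            # Assignment is still valid: keep it.
            plan[region] = server
        else:
            # Assignment is invalid: pick a RegionServer at random,
            # mirroring the random assignment described above.
            plan[region] = rng.choice(candidates)
    return plan


if __name__ == "__main__":
    meta = {"region-a": "rs1", "region-b": "rs2", "region-c": None}
    print(assign_on_startup(meta, {"rs1", "rs3"}, random.Random(0)))
```

In this model, `region-a` stays on `rs1` because that server is online, while `region-b` and `region-c` are placed on whichever online server the random choice selects.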
+ +
+ Failover + When a RegionServer fails (short version): + + + + The regions immediately become unavailable because the RegionServer is down. + + + The Master will detect that the RegionServer has failed. + + + The region assignments will be considered invalid, and the regions will be re-assigned just + as in the startup sequence. + + + + +
+ +
+ Region Load Balancing + + Regions can be periodically moved by the . + +
+ +
+ +
+ Region-RegionServer Locality + Over time, Region-RegionServer locality is achieved via an aspect of + HDFS block replication. The HDFS client, when choosing where to write its replicas, + by default does the following: + + First replica is written to local node + + Second replica to another node in same rack + + Third replica to a node in another rack (if sufficient nodes) + + + HBase eventually achieves locality for a region after a flush or a compaction. + In a RegionServer failover situation a RegionServer may be assigned regions with non-local + StoreFiles (i.e., none of the replicas are local); however, as new data is written + in the region, or the table is compacted and StoreFiles are re-written, they will eventually become "local" + to the RegionServer. + + For more information, see HDFS Design on Replica Placement + and also Lars George's blog on HBase and HDFS locality. + +
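The locality behaviour described above can be illustrated with a toy model. It follows the replica order listed in this section and, as a simplifying assumption, picks the first eligible node alphabetically where HDFS would choose among candidates itself. The names `place_replicas`, `locality`, and the node and rack labels are hypothetical, not part of HDFS or HBase.

```python
def place_replicas(writer, rack_of):
    """Choose up to three replica nodes for a block written by `writer`.

    rack_of: dict of node name -> rack name (hypothetical cluster layout).
    Order follows the text: local node, another node in the same rack,
    then a node in another rack if one exists.
    """
    local_rack = rack_of[writer]
    same_rack = sorted(n for n, r in rack_of.items() if r == local_rack and n != writer)
    other_rack = sorted(n for n, r in rack_of.items() if r != local_rack)

    replicas = [writer]
    if same_rack:
        replicas.append(same_rack[0])
    if other_rack:
        replicas.append(other_rack[0])
    return replicas


def locality(server, store_files):
    """Fraction of StoreFiles that have at least one replica on `server`."""
    if not store_files:
        return 1.0
    local = sum(1 for replicas in store_files if server in replicas)
    return local / len(store_files)


if __name__ == "__main__":
    racks = {"rs1": "rackA", "rs2": "rackA", "rs3": "rackB", "rs4": "rackB", "rs5": "rackB"}

    # StoreFile flushed while the region was served by rs1.
    old_file = place_replicas("rs1", racks)
    print(locality("rs4", [old_file]))   # region moved to rs4 after a failover

    # A compaction on rs4 re-writes the StoreFile, so rs4 now holds a replica.
    new_file = place_replicas("rs4", racks)
    print(locality("rs4", [new_file]))
```

In this layout the file written by `rs1` has no replica on `rs4`, so locality for the region on `rs4` starts at zero and rises to one once the compaction re-writes the file from `rs4`.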
+
Region Splits @@ -1725,15 +1811,6 @@ splits (and for why you might do this)
-
- Region Load Balancing - - - Regions can be periodically moved by the . - - -
-
Store A Store hosts a MemStore and 0 or more StoreFiles (HFiles). A Store corresponds to a column family for a table for a given region. @@ -2729,13 +2806,15 @@ Getting The Most From Your HBase Install by Ryan Rawson, Jonathan Gray (Hadoop World 2009).
-
Papers +
HBase Papers BigTable by Google (2006). + HBase and HDFS Locality by Lars George (2010). + No Relation: The Mixed Blessings of Non-Relational Databases by Ian Varley (2009).
-
Sites +
HBase Sites Cloudera's HBase Blog has a lot of links to useful HBase information. CAP Confusion is a relevant entry for background information on @@ -2746,10 +2825,15 @@ HBase Wiki has a page with a number of presentations.
-
Books +
HBase Books HBase: The Definitive Guide by Lars George.
+
Hadoop Books + Hadoop: The Definitive Guide by Tom White. + +
+ HBase and the Apache Software Foundation