Index: src/docbkx/performance.xml
===================================================================
--- src/docbkx/performance.xml (revision 1201939)
+++ src/docbkx/performance.xml (working copy)
@@ -140,10 +140,13 @@
The number of regions for an HBase table is driven by the . Also, see the architecture
section on
- A lower number of regions is preferred, generally in the range of 20 to 200
- per RegionServer. Adjust the regionsize as appropriate to achieve this number. There
- are some clusters that set the regionsize to 20Gb, for example, so you may need to
- experiment with this setting based on your hardware configuration and application needs.
+ A lower number of regions is preferred, generally in the range of 20 to the low hundreds
+ per RegionServer. Adjust the regionsize as appropriate to achieve this number.
+
+ For the 0.90.x codebase, the upper bound of regionsize is about 4 GB.
+ For the 0.92.x codebase, much larger regionsizes can be supported (e.g., 20 GB) due to the HFile v2 change.
+
+ You may need to experiment with this setting based on your hardware configuration and application needs.
@@ -155,12 +158,6 @@
something you want to consider.
-
- Compression
- Production systems should use compression with their column family definitions. See for more information.
-
-
-
hbase.regionserver.handler.countSee .
@@ -218,7 +215,52 @@
Key and Attribute LengthsSee .
-
+ Table RegionSize
+ The regionsize can be set on a per-table basis via setMaxFileSize on
+ HTableDescriptor in the
+ event that certain tables require a regionsize different from the configured default.
+
+ See for more information.
+
+
+
+ Bloom Filters
+ Bloom Filters can be enabled per-ColumnFamily.
+ Use HColumnDescriptor.setBloomFilterType(NONE | ROW |
+ ROWCOL) to enable blooms per ColumnFamily. The default is
+ NONE, meaning no bloom filters. If
+ ROW, the hash of the row will be added to the bloom
+ on each insert. If ROWCOL, the hash of the row +
+ column family + column qualifier will be added to the bloom on
+ each key insert.
+ See HColumnDescriptor and
+ for more information.
+
+
+ ColumnFamily BlockSize
+ The blocksize can be configured for each ColumnFamily in a table, and this defaults to 64 KB. Larger cell values require larger blocksizes.
+ There is an inverse relationship between blocksize and the size of the resulting StoreFile indexes (i.e., if the blocksize is doubled then the resulting
+ indexes should be roughly halved).
+
+ See HColumnDescriptor
+ and for more information.
+
+
+
+ In-Memory ColumnFamilies
+ ColumnFamilies can optionally be defined as in-memory. Data is still persisted to disk, just like any other ColumnFamily.
+ In-memory blocks have the highest priority in the , but there is no guarantee that the entire table
+ will be in memory.
+
+ See HColumnDescriptor for more information.
+
+
+
+ Compression
+ Production systems should use compression with their ColumnFamily definitions. See for more information.
+
+
+
Writing to HBase
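Note on the regionsize and blocksize hunks above: the arithmetic behind them can be sketched in a few lines of Java. This is a rough illustration only, not HBase code. The class `SizingSketch`, its method names, the 10,000 GB table, and the 10 RegionServers are all hypothetical, and the sketch assumes that regions are fully grown and evenly balanced and that a StoreFile index holds roughly one entry per block.

```java
/** Back-of-the-envelope sizing helpers. Hypothetical names; these are not HBase API calls. */
final class SizingSketch {
    static final long KB = 1024L;
    static final long GB = 1024L * 1024L * 1024L;

    /** Approximate regions hosted per RegionServer, assuming full, evenly balanced regions. */
    static long regionsPerServer(long tableBytes, long regionSizeBytes, int regionServers) {
        long regions = ceilDiv(tableBytes, regionSizeBytes);
        return ceilDiv(regions, regionServers);
    }

    /** Approximate index entries for one StoreFile, assuming one entry per block. */
    static long indexEntries(long storeFileBytes, long blockSizeBytes) {
        return ceilDiv(storeFileBytes, blockSizeBytes);
    }

    /** Integer division rounded up; rejects negative sizes and non-positive divisors. */
    static long ceilDiv(long a, long b) {
        if (a < 0 || b <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative and divisors positive");
        }
        return (a + b - 1) / b;
    }

    public static void main(String[] args) {
        long table = 10_000L * GB;   // hypothetical table size
        int servers = 10;            // hypothetical cluster size

        // Larger regionsizes mean fewer regions per RegionServer.
        System.out.println("4 GB regions per server:  " + regionsPerServer(table, 4 * GB, servers));
        System.out.println("20 GB regions per server: " + regionsPerServer(table, 20 * GB, servers));

        // Doubling the blocksize roughly halves the number of index entries.
        System.out.println("index entries at 64 KB:  " + indexEntries(GB, 64 * KB));
        System.out.println("index entries at 128 KB: " + indexEntries(GB, 128 * KB));
    }
}
```

With these made-up inputs, 4 GB regions land above the "low hundreds" guideline per RegionServer while 20 GB regions fall inside it, which is the kind of trade-off the regionsize text asks the reader to experiment with.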
Index: src/docbkx/book.xml
===================================================================
--- src/docbkx/book.xml (revision 1201939)
+++ src/docbkx/book.xml (working copy)
@@ -545,7 +545,8 @@
admin.enableTable(table);
See for more information about configuring client connections.
-
+ Note: online schema changes are supported in the 0.92.x codebase, but the 0.90.x codebase requires the table
+ to be disabled.
@@ -739,17 +740,6 @@
-
-
- In-Memory ColumnFamilies
-
- ColumnFamilies can optionally be defined as in-memory. Data is still persisted to disk, just like any other ColumnFamily.
- In-memory blocks have the highest priority in the , but it is not a guarantee that the entire table
- will be in memory.
-
- See HColumnDescriptor for more information.
-
- Time To Live (TTL)ColumnFamilies can set a TTL length in seconds, and HBase will automatically delete rows once the expiration time is reached.
@@ -775,20 +765,6 @@
See HColumnDescriptor for more information.
-
- Bloom Filters
- Bloom Filters can be enabled per-ColumnFamily.
- Use HColumnDescriptor.setBloomFilterType(NONE | ROW |
- ROWCOL) to enable blooms per Column Family. Default =
- NONE for no bloom filters. If
- ROW, the hash of the row will be added to the bloom
- on each insert. If ROWCOL, the hash of the row +
- column family + column family qualifier will be added to the bloom on
- each key insert.
- See HColumnDescriptor and
- for more information.
-
-
Secondary Indexes and Alternate Query Paths
@@ -874,6 +850,11 @@
+ Operational and Performance Configuration Options
+ See the Performance section for more information on operational and performance
+ schema design options, such as Bloom Filters, table-configured regionsizes, and blocksizes.
+
+
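Note on the Bloom Filters text moved by this patch: the difference between ROW and ROWCOL comes down to which bytes are hashed on each insert. The sketch below is a toy illustration under stated assumptions, not HBase's implementation. `BloomKeySketch`, `ToyBloom`, the naive byte concatenation, and the hash scheme are all hypothetical; only the three type names (NONE, ROW, ROWCOL) and the "row" versus "row + column family + qualifier" composition come from the documentation text.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.BitSet;

/** Toy illustration of ROW vs ROWCOL bloom keys. Hypothetical; not HBase code. */
final class BloomKeySketch {
    enum BloomType { NONE, ROW, ROWCOL }

    /** Bytes that would be hashed into the bloom for one insert; empty for NONE. */
    static byte[] bloomKey(BloomType type, byte[] row, byte[] family, byte[] qualifier) {
        switch (type) {
            case ROW:
                return row.clone();
            case ROWCOL:
                // Naive concatenation for illustration; a real encoding must be unambiguous.
                return concat(row, family, qualifier);
            default:
                return new byte[0];
        }
    }

    static byte[] concat(byte[]... parts) {
        int len = 0;
        for (byte[] p : parts) {
            len += p.length;
        }
        byte[] out = new byte[len];
        int pos = 0;
        for (byte[] p : parts) {
            System.arraycopy(p, 0, out, pos, p.length);
            pos += p.length;
        }
        return out;
    }

    /** Minimal bloom filter: a fixed bit set probed at several derived positions per key. */
    static final class ToyBloom {
        private final BitSet bits;
        private final int size;
        private final int hashes;

        ToyBloom(int size, int hashes) {
            this.bits = new BitSet(size);
            this.size = size;
            this.hashes = hashes;
        }

        void add(byte[] key) {
            for (int i = 0; i < hashes; i++) {
                bits.set(position(key, i));
            }
        }

        /** False means definitely absent; true means possibly present. */
        boolean mightContain(byte[] key) {
            for (int i = 0; i < hashes; i++) {
                if (!bits.get(position(key, i))) {
                    return false;
                }
            }
            return true;
        }

        private int position(byte[] key, int i) {
            int h1 = Arrays.hashCode(key);
            int h2 = Integer.rotateLeft(h1, 15) ^ 0x9E3779B9;
            return Math.floorMod(h1 + i * h2, size);
        }
    }

    public static void main(String[] args) {
        byte[] row = "row1".getBytes(StandardCharsets.UTF_8);
        byte[] family = "cf".getBytes(StandardCharsets.UTF_8);
        byte[] qualifier = "a".getBytes(StandardCharsets.UTF_8);

        ToyBloom bloom = new ToyBloom(1024, 3);
        bloom.add(bloomKey(BloomType.ROWCOL, row, family, qualifier));
        System.out.println(bloom.mightContain(bloomKey(BloomType.ROWCOL, row, family, qualifier)));
    }
}
```

In this toy model a ROW bloom can only answer "might this row be present?", while a ROWCOL bloom answers the narrower "might this row and column be present?" at the cost of one bloom entry per distinct column rather than per row.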