Index: src/docbkx/book.xml =================================================================== --- src/docbkx/book.xml (revision 1161343) +++ src/docbkx/book.xml (working copy) @@ -192,22 +192,23 @@ i.e. you query one column family or the other but usually not both at the one time. -
- - Monotonically Increasing Row Keys/Timeseries Data - - - In the HBase chapter of Tom White's book Hadoop: The Definitive Guide (O'Reilly) there is a an optimization note on watching out for a phenomenon where an import process walks in lock-step with all clients in concert pounding one of the table's regions (and thus, a single node), then moving onto the next region, etc. With monotonically increasing row-keys (i.e., using a timestamp), this will happen. See this comic by IKai Lan on why monotically increasing row keys are problematic in BigTable-like datastores: +
Rowkey Design +
+ + Monotonically Increasing Row Keys/Timeseries Data + + + In the HBase chapter of Tom White's book Hadoop: The Definitive Guide (O'Reilly) there is an optimization note on watching out for a phenomenon where an import process walks in lock-step with all clients in concert pounding one of the table's regions (and thus, a single node), then moving onto the next region, etc. With monotonically increasing row-keys (i.e., using a timestamp), this will happen. See this comic by IKai Lan on why monotonically increasing row keys are problematic in BigTable-like datastores: monotonically increasing values are bad. The pile-up on a single region brought on - by monoticially increasing keys can be mitigated by randomizing the input records to not be in sorted order, but in general its best to avoid using a timestamp or a sequence (e.g. 1, 2, 3) as the row-key. - + by monotonically increasing keys can be mitigated by randomizing the input records to not be in sorted order, but in general it's best to avoid using a timestamp or a sequence (e.g. 1, 2, 3) as the row-key. + - If you do need to upload time series data into HBase, you should - study OpenTSDB as a - successful example. It has a page describing the schema it uses in - HBase. The key format in OpenTSDB is effectively [metric_type][event_timestamp], which would appear at first glance to contradict the previous advice about not using a timestamp as the key. However, the difference is that the timestamp is not in the lead position of the key, and the design assumption is that there are dozens or hundreds (or more) of different metric types. Thus, even with a continual stream of input data with a mix of metric types, the Puts are distributed across various points of regions in the table. - + If you do need to upload time series data into HBase, you should + study OpenTSDB as a + successful example. It has a page describing the schema it uses in + HBase. 
The key format in OpenTSDB is effectively [metric_type][event_timestamp], which would appear at first glance to contradict the previous advice about not using a timestamp as the key. However, the difference is that the timestamp is not in the lead position of the key, and the design assumption is that there are dozens or hundreds (or more) of different metric types. Thus, even with a continual stream of input data with a mix of metric types, the Puts are distributed across various points of regions in the table. +
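The [metric_type][event_timestamp] layout described above can be illustrated with a minimal sketch. This is not OpenTSDB's actual byte layout: the class name MetricFirstKey, the 4-byte numeric metric id, and the 8-byte millisecond timestamp are all assumptions made for illustration. It only shows that, with the metric in the lead position, writes arriving at the same instant for different metrics land at different points in the sorted key space.

```java
import java.nio.ByteBuffer;

public class MetricFirstKey {
    // Hypothetical layout: [4-byte metric id][8-byte timestamp], both big-endian.
    // Assumes non-negative ids and timestamps so that unsigned byte order
    // matches numeric order.
    public static byte[] rowKey(int metricId, long timestampMillis) {
        return ByteBuffer.allocate(Integer.BYTES + Long.BYTES)
                .putInt(metricId)
                .putLong(timestampMillis)
                .array();
    }

    // Unsigned lexicographic comparison of byte arrays, the order in which
    // row keys are sorted.
    public static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        long now = 1_300_000_000_000L;
        byte[] cpuNow = rowKey(1, now);
        byte[] cpuLater = rowKey(1, now + 60_000L);
        byte[] diskNow = rowKey(2, now);

        // Within one metric, keys are ordered by time.
        System.out.println(compare(cpuNow, cpuLater) < 0);
        // A later write for metric 1 still sorts before every key of metric 2,
        // so concurrent writes for the two metrics hit different key ranges.
        System.out.println(compare(cpuLater, diskNow) < 0);
    }
}
```
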
Try to minimize row and column sizes @@ -231,8 +232,8 @@ the thread a question storefileIndexSize up on the user mailing list. - Most frequently small inefficiencies don't matter all that much. Unfortunately, - this is a case where it does. Whatever patterns are selected for ColumnFamilies, attributes, and rowkeys they could be repeated + Most of the time small inefficiencies don't matter all that much. Unfortunately, + this is a case where they do. Whatever patterns are selected for ColumnFamilies, attributes, and rowkeys they could be repeated several billion times in your data
Column Families Try to keep the ColumnFamily names as small as possible, preferably one character (e.g. "d" for data/default). @@ -243,14 +244,33 @@ to store in HBase.
-
Row Key +
Rowkey Length Keep them as short as is reasonable such that they can still be useful for required data access (e.g., Get vs. Scan). A short key that is useless for data access is not better than a longer key with better get/scan properties. Expect tradeoffs when designing rowkeys.
-
-
+
+
Reverse Timestamps + A common problem in database processing is quickly finding the most recent version of a value. A technique using reverse timestamps + as a part of the key can help greatly with a special case of this problem. Also found in the HBase chapter of Tom White's book Hadoop: The Definitive Guide (O'Reilly), + the technique involves appending (Long.MAX_VALUE - timestamp) to the end of any key, e.g., [key][reverse_timestamp]. + + The most recent value for [key] in a table can be found by performing a Scan for [key] and obtaining the first record. Since HBase keys + are in sorted order, this key sorts before any older row-keys for [key] and thus is first. + + This technique would be used instead of using HBase Versioning where the intent is to hold onto all versions + "forever" (or a very long time) and at the same time quickly obtain access to any other version by using the same Scan technique. + +
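As a rough illustration of the [key][reverse_timestamp] technique, the sketch below builds such keys with plain JDK classes and checks their sort order. The class name ReverseTimestampKey, the UTF-8 key prefix, and the 8-byte big-endian suffix are assumptions for illustration only; it also assumes non-negative timestamps, and it does not call any HBase API.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ReverseTimestampKey {
    // Builds [key][Long.MAX_VALUE - timestamp] with an 8-byte big-endian suffix.
    public static byte[] rowKey(String key, long timestampMillis) {
        byte[] prefix = key.getBytes(StandardCharsets.UTF_8);
        long reverse = Long.MAX_VALUE - timestampMillis;
        return ByteBuffer.allocate(prefix.length + Long.BYTES)
                .put(prefix)
                .putLong(reverse)
                .array();
    }

    // Unsigned lexicographic comparison of byte arrays, the order in which
    // row keys are sorted.
    public static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] older = rowKey("sensor42", 1_000L);
        byte[] newer = rowKey("sensor42", 2_000L);
        // The newer record has the smaller reverse timestamp, so it sorts first
        // and is the first row a scan starting at "sensor42" would reach.
        System.out.println(compare(newer, older) < 0);
    }
}
```

With variable-length prefixes, a scan for one key can also match longer keys that share the same leading bytes, so a fixed-width or delimited [key] portion is likely the safer choice in practice.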
+
Immutability of Rowkeys + Rowkeys cannot be changed. The only way they can be "changed" in a table is if the row is deleted and then re-inserted. + This is a fairly common question on the HBase dist-list so it pays to get the rowkeys right the first time (and/or before you've + inserted a lot of data). + +
+
+
Number of Versions @@ -262,12 +282,14 @@ stores different values per row by time (and qualifier). Excess versions are removed during major compactions. The number of versions may need to be increased or decreased depending on application needs. -
-
- - Minimum Number of Versions - - Like number of row versions, the minimum number of row versions to keep is configured per column + Setting the number of versions to an exceedingly high level (e.g., hundreds or more) is not recommended unless those old values are + very dear to you, because this will greatly increase StoreFile size. + 
+ + Minimum Number of Versions + + Like number of row versions, the minimum number of row versions to keep is configured per column family via HColumnDescriptor. The default is 0, which means the feature is disabled. The minimum number of row versions parameter is used together with the time-to-live parameter and can be combined with the @@ -276,16 +298,8 @@ (where M is the value for minimum number of row versions, M<=N). This parameter should only be set when time-to-live is enabled for a column family and must be less or equal to the number of row versions. - -
-
- - Immutability of Rowkeys - - Rowkeys cannot be changed. The only way they can be "changed" in a table is if the row is deleted and then re-inserted. - This is a fairly common question on the HBase dist-list so it pays to get the rowkeys right the first time (and/or before you've - inserted a lot of data). - + +
@@ -861,6 +875,64 @@ <chapter xml:id="architecture"> <title>Architecture +
+ Catalog Tables + + +
+ ROOT + -ROOT- keeps track of where the .META. table is. The -ROOT- table structure is as follows: + + Key: + + .META. region key (.META.,,1) + + + Values: + + info:regioninfo (serialized HRegionInfo + instance of .META.) + info:server (server:port of the RegionServer holding .META.) + info:serverstartcode (start-time of the RegionServer process holding .META.) + + +
+
+ META + The .META. table keeps a list of all regions in the system. The .META. table structure is as follows: + + Key: + + Region key of the format ([table],[region start key],[region id]) + + + Values: + + info:regioninfo (serialized + HRegionInfo instance for this region) + + info:server (server:port of the RegionServer containing this region) + info:serverstartcode (start-time of the RegionServer process containing this region) + + + When a table is in the process of splitting, two other columns will be created, info:splitA and info:splitB, + which represent the two daughter regions. The values for these columns are also serialized HRegionInfo instances. + After the region has been split, this row will eventually be deleted. + + Notes on HRegionInfo: the empty key is used to denote table start and table end. A region with an empty start key + is the first region in a table. If a region has both an empty start and an empty end key, it is the only region in the table. + + In the (hopefully unlikely) event that programmatic processing of catalog metadata is required, see the + Writables utility. + 
+
+ Startup Sequencing + The META location is set in ROOT first. Then META is updated with server and startcode values. + +
+
+
Client The HBase client