Row Groups / Row Family / Entity Groups in HBase
===================

What?
-----
Row groups are semantic groupings of rows in the HBase data model. All rows within a given row group share the same row group key. Row groups are similar to column families in HBase or locality groups in BigTable, but transposed to rows instead of columns. All the rows within a row group physically belong together and are served by a single region. This means that region boundaries cannot split a row group.

Row groups are not predefined; they are dynamic. In the degenerate case, there can be one row group per row. Row group keys are fully optional and backwards compatible.

Why?
----
We may need row groups for three reasons:

1. Native support for in-region local transactions (http://hadoop-hbase.blogspot.com/2012_02_01_archive.html, https://issues.apache.org/jira/browse/HBASE-5229)
2. Allowing the user to better manage region splitting without writing a custom RegionSplitPolicy
3. Performance

HBase does not expose regions as an API. However, to natively implement 1 or 2, we need to expose a way for the user to define the rows that are co-located in the same region. For example, KeyPrefixRegionSplitPolicy kind of does that, but it only applies if you have fixed-length key prefixes. Otherwise, for each table, you have to write a custom RegionSplitPolicy that understands your compound row keys and parses the split points.

For 3, we can also make use of row groups in block encoding (like prefix encoding) by encoding the row group key once per row group, or develop filters that understand row groups, if set, so that you can skip further ahead.

How?
----
Entity Groups in Megastore are used exactly to support local transactions. However, they have implemented entity groups to correspond to only one row. This is also possible in HBase, since there are row transactions and sufficient filter support. However, performance-wise, very wide rows are not a good idea, at least for now.

Implement as co-processors vs native API?
Local transactions are implemented as coprocessors right now, but if we want to expose them as a generic API for users, we should consider the native approach. And given that we might need local transactions in META, etc., I think it is a good idea to go for the native approach. Row groups will fit nicely in the HBase data model.

Example:
--------
The META table contains row keys like: test_table,a,1359411246232.b07d0034cbe72cb040ae9cf66300a10c.
That is: table name + "," + start key + "," + region id + "." + encodedName

We want local transactions in the META table for various reasons (see BlockingMetaScannerVisitor, online merge, etc.). The easiest way to achieve this would be to make the table name the row group key, which would automatically mean that all regions of a table belong to one META region. This assumption is acceptable (considering 10G regions), and is way better than the current limitation of one META region. Then, if we provide a native API for local transactions, we are done.

A second example is a file system API on top of HBase. You might end up splitting the files into blocks, and have row keys like: file name + "#" + block id. If you want an atomic rename for the file name, you may need local transactions for the rows of the file. Of course, you can also do a "wide table" rather than a "tall table", but having row groups enables a "tall table" schema design.

Links:
------
Megastore Entity Groups: www.cidrdb.org/cidr2011/Papers/CIDR11_Paper32.pdf
G-Store Row-Groups: www.cs.ucsb.edu/~sudipto/papers/socc10-das.pdf
Cassandra issue: https://issues.apache.org/jira/browse/CASSANDRA-1684
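
Sketch:
-------
To make the idea concrete, here is a minimal sketch of how a row group key could be derived from the compound row keys in the two examples, and how a proposed split point could be moved so that it never falls inside a row group. It is written in Python for brevity and is not HBase code. The function names (meta_row_group, file_row_group, adjust_split_point), the "#" file key layout, and the sample keys are all assumptions made for illustration; a real implementation would live in the Java split logic.

```python
from typing import Callable, List, Optional


def meta_row_group(row_key: bytes) -> bytes:
    """Hypothetical extractor: a META-style key's row group is the part before the first ','."""
    return row_key.split(b",", 1)[0]


def file_row_group(row_key: bytes) -> bytes:
    """Hypothetical extractor: a file-block key's row group is the part before the first '#'."""
    return row_key.split(b"#", 1)[0]


def adjust_split_point(
    sorted_keys: List[bytes],
    proposed_index: int,
    group_of: Callable[[bytes], bytes],
) -> Optional[int]:
    """Return the index of the first row of the right-hand region.

    The proposed index is moved back to the start of the row group it falls in.
    If that group starts the region, the index is moved forward to the start of
    the next group instead. Returns None when all rows share one row group,
    because no split can avoid cutting that group.
    """
    group = group_of(sorted_keys[proposed_index])

    # Walk back to the first row of the group containing the proposed split.
    start = proposed_index
    while start > 0 and group_of(sorted_keys[start - 1]) == group:
        start -= 1
    if start > 0:
        return start

    # The group begins the region; walk forward to the first row of the next group.
    end = proposed_index
    while end < len(sorted_keys) and group_of(sorted_keys[end]) == group:
        end += 1
    return end if end < len(sorted_keys) else None


# Made-up META-style keys, already in sorted (byte) order.
meta_keys = [
    b"t1,a,1.x",
    b"t1,m,2.y",
    b"t2,,3.z",
    b"t2,g,4.w",
    b"t2,p,5.v",
    b"t3,,6.u",
]

# A midpoint split at index 3 would cut table t2 in half; it is moved to index 2.
print(adjust_split_point(meta_keys, 3, meta_row_group))  # 2
```

Because all keys of a row group share the prefix made of the group key plus the separator, they are contiguous in sorted order, which is what lets the sketch find group boundaries by scanning neighbours. This relies on the separator never appearing inside the group key itself.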