Indexed HBase
This page gives a high-level overview of the indexed HBase contrib. It assumes that the reader has in-depth knowledge of HBase. A good high-level description of the HBase architecture can be found at the HBase wiki and here.
Purpose
The goal of the indexed HBase contrib is to speed up scans by indexing HBase columns. Indexed HBase (IHBase) differs from the indexed tables in transactional HBase (THBase): THBase indexes are themselves HBase tables that use the indexed column's values as row keys, whereas IHBase creates its indexes at the region level. The differences are summarized in the table below.
| Feature | THBase | IHBase | Comment |
|---|---|---|---|
| Global ordering | yes | no | IHBase has an index for each region. The flip side of not having global ordering is compatibility with the good old HRegion: results come back in row order (and not in value order as in THBase) |
| Full table scan? | no | no | THBase does a partial scan on the index table. IHBase supports specifying start/end rows to limit the number of scanned regions |
| Multiple index usage | no | yes | IHBase can take advantage of multiple indexes in the same scan. The IHBase IdxScan object accepts an Expression, which allows intersection/union of several indexed column criteria |
| Extra disk storage | yes | no | IHBase indexes are created when the region starts/flushes and do not require any extra disk storage |
| Extra RAM | yes | yes | IHBase indexes are held in memory and hence increase the memory overhead. THBase indexes increase the number of regions each region server has to support, thus costing memory too |
| Parallel scanning support | no | yes | In THBase the index table needs to be consulted and then GETs are issued for each matching row. The behavior of IHBase (as perceived by the client) is no different from a regular scan and hence supports parallel scanning seamlessly. Parallel GETs could be implemented to speed up THBase scans |
Why do we think IHBase outperforms THBase?
- More flexible:
  - Supports range queries and multi-index queries
  - Supports different types - not only byte arrays
- Less overhead: THBase pays at least two 'table roundtrips' - one for the index table and the other for the main table
- Quicker index expression evaluation: IHBase uses dedicated index data structures while THBase uses the regular HRegion scan facilities
Usage
To use Indexed HBase do the following:
- Set the hbase.hregion.impl property to IdxRegion
IdxRegion HBase configuration snippet
<property>
  <name>hbase.hregion.impl</name>
  <value>org.apache.hadoop.hbase.regionserver.IdxRegion</value>
</property>
- When creating a table, define which columns to index using IdxColumnDescriptor. The supported types are all the Java primitive data types except boolean, plus byte[], char[] and BigDecimal
Creating an HTable with an index on family:qual column
Note that this snippet assumes that all the values assigned to family:qual are exactly 8 bytes long, preferably created using Bytes.toBytes(long). The table may contain rows in which family:qual is missing; those rows will not be included in the index.
byte[] tableName = Bytes.toBytes("table");
byte[] familyName = Bytes.toBytes("family");
byte[] qualifier = Bytes.toBytes("qual");

// Describe the family and declare a LONG index on family:qual
IdxColumnDescriptor idxColumnDescriptor = new IdxColumnDescriptor(familyName);
IdxIndexDescriptor indexDescriptor = new IdxIndexDescriptor(qualifier, IdxQualifierType.LONG);
idxColumnDescriptor.addIndexDescriptor(indexDescriptor);
HTableDescriptor htd = new HTableDescriptor(tableName);
htd.addFamily(idxColumnDescriptor);

// Use IdxRegion as the region implementation, then create and open the table
HBaseConfiguration conf = new HBaseConfiguration();
conf.setClass(HConstants.REGION_IMPL, IdxRegion.class, IdxRegion.class);
HBaseAdmin admin = new HBaseAdmin(conf);
admin.createTable(htd);
HTable table = new HTable(conf, htd.getName());
. . .
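The 8-byte requirement noted above can be illustrated without any HBase dependency. The sketch below assumes that Bytes.toBytes(long) produces the 8-byte big-endian form of the value; the class LongValueEncodingSketch and its methods are hypothetical names used only for illustration and are not part of the contrib.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper that mimics the assumed 8-byte encoding of a long value. */
final class LongValueEncodingSketch {

    /** Encodes a long as exactly 8 bytes (ByteBuffer is big-endian by default). */
    static byte[] encode(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    /** Decodes an 8-byte value and rejects anything of a different length. */
    static long decode(byte[] bytes) {
        if (bytes.length != Long.BYTES) {
            throw new IllegalArgumentException(
                "expected " + Long.BYTES + " bytes but got " + bytes.length);
        }
        return ByteBuffer.wrap(bytes).getLong();
    }
}
```

A value written as the string "42" would be only 2 bytes long, so it would not fit the 8-byte layout that a LONG index on family:qual assumes.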
- When scanning, make sure you instantiate an IdxScan and that you set the Expression property
Indexed scans
Notes:
- Setting an expression doesn't exclude setting a matching filter. This duplication is absolutely essential for getting correct scan results
- The index expression must accept any row accepted by the filter
- The filter may accept a subset of the rows accepted by the index expression (e.g. to narrow down the result set)
- Setting a filter without setting an expression is supported and reverts to a 'good old scan'
- The supported expression types are comparison, and, or. Comparisons support GT, GTE, EQ, LTE, LT
- The caller may combine any number of index expressions using any of the existing indexes. Trying to add an expression for a non-indexed column results in a runtime error
. . .
IdxScan idxScan = new IdxScan();
// The index expression preselects rows whose family:qual equals 42
idxScan.setExpression(
    Expression.comparison(familyName, qualifier, Comparison.Operator.EQ, Bytes.toBytes(42L)));
// The matching filter is still required for correct results
idxScan.setFilter(
    new SingleColumnValueFilter(familyName, qualifier, CompareFilter.CompareOp.EQUAL, Bytes.toBytes(42L)));
idxScan.setCaching(1000);

ResultScanner scanner = table.getScanner(idxScan);
for (Result res : scanner) {
  // Do stuff with res
}
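The relationship between the index expression and the filter described in the notes can be modelled with plain Java. This is a simplified sketch, not the contrib's implementation: it ignores the memstore, treats each row as a single long value, and the class ExpressionFilterSketch and its scan method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongPredicate;

/** Hypothetical model of a scan that uses an index expression plus a filter. */
final class ExpressionFilterSketch {

    /**
     * Rows rejected by the index expression are never looked at;
     * the filter then decides, row by row, among the remaining candidates.
     */
    static List<Long> scan(long[] values, LongPredicate indexExpression, LongPredicate filter) {
        List<Long> results = new ArrayList<>();
        for (long value : values) {
            if (!indexExpression.test(value)) {
                continue; // skipped by the index, the filter never sees this row
            }
            if (filter.test(value)) {
                results.add(value);
            }
        }
        return results;
    }
}
```

In this model, an expression such as `v -> v >= 40` combined with a filter `v -> v == 42` returns the row holding 42, because the expression accepts every row the filter accepts. An expression such as `v -> v > 45` with the same filter silently loses that row, which is why the expression must never be narrower than the filter.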
Implementation notes
- Only store files are indexed. Every index scan performs a full memstore scan. Indexing the memstore will be implemented only if scanning the memstore proves to be a performance bottleneck
- Index expression evaluation is performed using bitsets. There are two types of bitsets: compressed and expanded. An index will typically store a compressed bitset while an expression evaluator will most probably use an expanded bitset
- TODO
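The bitset-based expression evaluation mentioned in the notes above can be sketched roughly as follows. This is an illustration under stated assumptions, not the contrib's actual data structures: RegionIndexSketch is a hypothetical class, each row is identified by a hypothetical integer position within the region, and java.util.BitSet stands in for an expanded (uncompressed) bitset only.

```java
import java.util.BitSet;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical model of one region's index on a single LONG column. */
final class RegionIndexSketch {
    // Column value -> bitset of row positions holding that value, sorted by value.
    private final NavigableMap<Long, BitSet> rowsByValue = new TreeMap<>();

    /** Records that the row at the given position holds the given value. */
    void put(int rowPosition, long value) {
        rowsByValue.computeIfAbsent(value, v -> new BitSet()).set(rowPosition);
    }

    /** EQ comparison: a copy of the rows holding exactly this value. */
    BitSet equalTo(long value) {
        BitSet rows = rowsByValue.get(value);
        return rows == null ? new BitSet() : (BitSet) rows.clone();
    }

    /** GT (inclusive = false) or GTE (inclusive = true) comparison. */
    BitSet greaterThan(long value, boolean inclusive) {
        return union(rowsByValue.tailMap(value, inclusive).values());
    }

    /** LT (inclusive = false) or LTE (inclusive = true) comparison. */
    BitSet lessThan(long value, boolean inclusive) {
        return union(rowsByValue.headMap(value, inclusive).values());
    }

    /** 'and' expression: rows matching both operands. */
    static BitSet and(BitSet left, BitSet right) {
        BitSet result = (BitSet) left.clone();
        result.and(right);
        return result;
    }

    /** 'or' expression: rows matching either operand. */
    static BitSet or(BitSet left, BitSet right) {
        BitSet result = (BitSet) left.clone();
        result.or(right);
        return result;
    }

    private static BitSet union(Iterable<BitSet> sets) {
        BitSet result = new BitSet();
        for (BitSet set : sets) {
            result.or(set);
        }
        return result;
    }
}
```

With one such index per indexed column, a multi-index criterion reduces to bitwise operations on the per-column results, for example `RegionIndexSketch.and(age.equalTo(42L), score.greaterThan(5L, false))` for two hypothetical indexes named age and score.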