Index: src/main/docbkx/book.xml
===================================================================
--- src/main/docbkx/book.xml (revision 1525807)
+++ src/main/docbkx/book.xml (working copy)
@@ -3267,7 +3267,11 @@
- + +
getShortMidpointKey (an optimization for data index blocks)
+ Note: this optimization is available in HBase 0.95 and later.
+ Before HBASE-7845, when finalizing a data block (for example, on reaching the configured block size or on closing the file), HBase used the start key of the current data block as the index entry added to the current leaf index block. Indexing on the stop key of the previous data block is arguably a better choice (see HBASE-5987 and HBASE-4443 for details), but changing this behavior is not easy without refactoring a lot of low-level code. HBASE-7845 instead implements the getShortMidpointKey method, which is similar to LevelDB's BytewiseComparatorImpl::FindShortestSeparator() and FindShortSuccessor(). The core idea of getShortMidpointKey is to generate a "virtual" key that is greater than the stop key of the previous data block and less than or equal to the start key of the current data block. The gap between the stop key of the previous data block and the virtual key is kept as small as possible, and the virtual key itself is kept as short as possible. For example, if the stop key of the previous block is "the quick brown fox" and the start key of the current block is "the who", getShortMidpointKey can generate a virtual key such as "the r" as the new index entry. This brings at least two benefits: 1) it reduces the size of the HFile data index; 2) it avoids an extra seek into the previous data block when the target key is in the range ["virtual key", "start key of current block"].
+
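The separator idea described above can be sketched on plain byte arrays. This is a minimal illustration, not the code from HBASE-7845: the class name ShortSeparatorSketch and the method shortSeparator are hypothetical, and the sketch assumes keys are compared as unsigned bytes in lexicographic order. It does not model any key structure beyond a flat byte array.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch of the "virtual key" idea; not HBase's implementation.
final class ShortSeparatorSketch {

    /** Unsigned, lexicographic comparison of two byte arrays. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return Integer.compare(a.length, b.length);
    }

    /**
     * Returns a short key k such that left < k <= right.
     * left is the stop key of the previous block, right is the
     * start key of the current block (assumed: left sorts before right).
     */
    static byte[] shortSeparator(byte[] left, byte[] right) {
        if (compare(left, right) >= 0) {
            throw new IllegalArgumentException("left must sort before right");
        }
        int min = Math.min(left.length, right.length);
        int diff = 0;
        while (diff < min && left[diff] == right[diff]) {
            diff++; // skip the common prefix
        }
        if (diff == min) {
            // left is a strict prefix of right: one extra byte of right suffices.
            return Arrays.copyOf(right, diff + 1);
        }
        int l = left[diff] & 0xff;
        int r = right[diff] & 0xff;
        if (l + 1 < r) {
            // Room between the two bytes: bump left's byte and truncate.
            byte[] sep = Arrays.copyOf(left, diff + 1);
            sep[diff] = (byte) (l + 1);
            return sep;
        }
        // Adjacent bytes: the prefix of right through the differing byte
        // is already greater than left and no greater than right.
        return Arrays.copyOf(right, diff + 1);
    }

    public static void main(String[] args) {
        byte[] prev = "the quick brown fox".getBytes(StandardCharsets.UTF_8);
        byte[] curr = "the who".getBytes(StandardCharsets.UTF_8);
        byte[] sep = shortSeparator(prev, curr);
        System.out.println(new String(sep, StandardCharsets.UTF_8)); // prints "the r"
    }
}
```

With the keys from the example, the common prefix is "the ", the first differing bytes are 'q' and 'w', and incrementing 'q' gives "the r". Truncating after the first differing byte keeps the index entry short, and incrementing that byte by only one keeps it close to the previous block's stop key.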
Other Information About HBase