Index: src/main/docbkx/book.xml
===================================================================
--- src/main/docbkx/book.xml (revision 1525807)
+++ src/main/docbkx/book.xml (working copy)
@@ -3267,7 +3267,11 @@
-
+
+ getShortMidpointKey (an optimization for the data index block)
+ Note: this optimization was introduced in HBase 0.95+.
+ Before HBASE-7845, when HBase finalized a data block (for example, on reaching the configured block size or closing the file), it used the start key of the current data block as the index entry added to the current leaf index block. In some respects, indexing on the stop key of the previous data block is the better choice (see HBASE-5987 and HBASE-4443 for details), but changing this behavior directly would require refactoring a lot of low-level code.
+ HBASE-7845 instead implemented the getShortMidpointKey method, which is similar to LevelDB's BytewiseComparatorImpl::FindShortestSeparator() and FindShortSuccessor(). Its core idea is to generate a "virtual" key that is greater than the stop key of the previous data block and less than or equal to the start key of the current data block. The virtual key is kept as close as possible to the stop key of the previous block, and it is also kept as short as possible. For example, if the stop key of the previous block is "the quick brown fox" and the start key of the current block is "the who", getShortMidpointKey can generate a virtual key such as "the r" as the new index entry.
+ This brings at least two benefits: 1) it reduces the size of the HFile data index; 2) it avoids an extra seek into the previous data block when the target key falls in the range ["virtual key", "start key of current block"].
+
Other Information About HBase
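
Reviewer note on the getShortMidpointKey paragraph added by this patch: the virtual-key idea can be illustrated with a minimal sketch over plain byte arrays. This is not the HBase implementation. The class name `ShortSeparatorSketch` and the methods `compare` and `shortSeparator` are hypothetical, and the sketch assumes keys are simple byte strings compared as unsigned bytes, whereas real HBase keys also carry family, qualifier, timestamp, and type.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; not the code from HBASE-7845.
final class ShortSeparatorSketch {

    /** Unsigned lexicographic comparison; a proper prefix sorts first. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    /**
     * Returns a short key k with left < k <= right, where left is the last
     * key of the previous block and right is the first key of the current one.
     */
    static byte[] shortSeparator(byte[] left, byte[] right) {
        if (compare(left, right) >= 0) {
            throw new IllegalArgumentException("left must sort strictly before right");
        }
        int n = Math.min(left.length, right.length);
        int i = 0;
        while (i < n && left[i] == right[i]) {
            i++; // skip the common prefix
        }
        if (i == left.length) {
            // left is a proper prefix of right: append a single 0x00 byte,
            // which is the smallest key greater than left.
            return Arrays.copyOf(left, i + 1);
        }
        // First differing byte: keep the common prefix and bump left's byte
        // by one. Because left[i] < right[i], the result cannot exceed right.
        byte[] sep = Arrays.copyOf(left, i + 1);
        sep[i] = (byte) ((left[i] & 0xff) + 1);
        return sep;
    }

    public static void main(String[] args) {
        byte[] left = "the quick brown fox".getBytes(StandardCharsets.UTF_8);
        byte[] right = "the who".getBytes(StandardCharsets.UTF_8);
        byte[] sep = shortSeparator(left, right);
        System.out.println(new String(sep, StandardCharsets.UTF_8)); // prints "the r"
    }
}
```

With the keys from the paragraph, the common prefix is "the ", the first differing bytes are 'q' and 'w', and bumping 'q' by one yields "the r", which is shorter than either block boundary key.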