[HBASE-1978] Change the range/block index scheme from [start,end) to (start, end], and index range/block by endKey, specially in HFile - ASF JIRA

Details

Type: New Feature
Status: Closed
Priority: Major
Resolution: Later
Affects Version/s: None
Fix Version/s: None
Component/s: io, master, regionserver
Labels: None

Tags:
HFile, METADATA, INDEX

Description

During the code review of HFile (~~HBASE-1818~~), we found that HFile allows duplicate keys. However, the old implementation could miss some of those duplicates during seek and scan when the duplicate keys span multiple blocks.

We provided a patch (~~HBASE-1841~~ is step 1) to resolve the above issue. That patch modified HFile.Writer to avoid generating a problematic HFile with cross-block duplicate keys: it only starts a new block when the key being appended differs from the last appended key. But it still carries a risk: if a user of HFile.Writer appends the same key many times, the result is a very large block that needs a lot of memory or causes an out-of-memory error.
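The block-boundary rule described above can be sketched as follows. This is only an illustration, not the HBASE-1841 patch itself: the class `BlockBoundarySketch`, the method `splitIntoBlocks`, and the entry-count limit `maxEntriesPerBlock` are hypothetical names, and a count limit stands in for whatever block size limit the real writer applies.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: not HFile.Writer, just the boundary rule in isolation.
final class BlockBoundarySketch {

    // Splits already-sorted keys into blocks. A full block is closed only
    // when the next key differs from the last appended key, so a run of
    // duplicate keys never crosses a block boundary.
    static List<List<String>> splitIntoBlocks(List<String> sortedKeys, int maxEntriesPerBlock) {
        List<List<String>> blocks = new ArrayList<>();
        List<String> current = new ArrayList<>();
        String lastKey = null;
        for (String key : sortedKeys) {
            boolean blockFull = current.size() >= maxEntriesPerBlock;
            boolean keyChanged = lastKey == null || !lastKey.equals(key);
            if (blockFull && keyChanged) {
                blocks.add(current);
                current = new ArrayList<>();
            }
            // If the block is full but the key is a duplicate, the block
            // keeps growing. This is the unbounded-memory risk noted above.
            current.add(key);
            lastKey = key;
        }
        if (!current.isEmpty()) {
            blocks.add(current);
        }
        return blocks;
    }
}
```

With a limit of two entries, the keys a, b, b, b, c end up as one block holding a and all three copies of b, followed by a block holding c. The first block exceeds the limit because the duplicates cannot be split.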

The current HFile block index uses the startKey to index a block, i.e. the range/block index scheme is [startKey,endKey).

Refer to section 5.1 of the Google Bigtable paper:

"The METADATA table stores the location of a tablet under a row key that is an encoding of the tablet's table identifier and its end row."

The principle behind Bigtable's METADATA table is the same as that of the BlockIndex in an SSTable or HFile, so we should use the endKey in HFile's BlockIndex. In my experience with Hypertable, the METADATA key is also "tableID:endRow".

We would change the index scheme in HFile from [startKey,endKey) to (startKey,endKey], and change the binary search method to match this index scheme.

This change can resolve the duplicate-key issue described above.
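To make the difference between the two schemes concrete, here is a minimal sketch of both lookups over a plain sorted list of index keys. It is not the code in the attached patch: `EndKeyIndexSketch`, `seekByStartKey`, and `seekByEndKey` are hypothetical names, and `String` keys stand in for HFile's byte-array keys.

```java
import java.util.List;

// Hypothetical sketch: compares the two index schemes on a list of keys.
final class EndKeyIndexSketch {

    // Start-key scheme [startKey, endKey): choose the LAST block whose
    // start key is <= the sought key. Returns -1 if no block qualifies.
    static int seekByStartKey(List<String> startKeys, String key) {
        int lo = 0, hi = startKeys.size() - 1, found = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (startKeys.get(mid).compareTo(key) <= 0) {
                found = mid;      // candidate; look for a later one
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return found;
    }

    // End-key scheme (startKey, endKey]: choose the FIRST block whose
    // end key is >= the sought key. Returns -1 if the key is past the
    // last block.
    static int seekByEndKey(List<String> endKeys, String key) {
        int lo = 0, hi = endKeys.size() - 1, found = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (endKeys.get(mid).compareTo(key) >= 0) {
                found = mid;      // candidate; look for an earlier one
                hi = mid - 1;
            } else {
                lo = mid + 1;
            }
        }
        return found;
    }
}
```

Consider three blocks holding the keys (a, b, b), (b, b, c), and (d, e). The start keys are a, b, d and the end keys are b, c, e. Seeking b by start key lands on the second block and skips the copies of b at the end of the first block, while seeking b by end key lands on the first block, from which a scan reaches every copy.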

Note:
A complete fix would require modifying many modules in HBase, apparently including HFile, the META schema, some internal code, etc.

Attachments

HBASE-1978-HFile-v1.patch
13/Nov/09 10:46
14 kB
Schubert Zhang

Issue Links

is related to

HBASE-2600 Change how we do meta tables; from tablename+STARTROW+randomid to instead, tablename+ENDROW+randomid

Closed

relates to

HBASE-1841 If multiple of same key in an hfile and they span blocks, may miss the earlier keys on a lookup

Closed

Activity

People

Assignee: Unassigned

Reporter: Schubert Zhang

Votes: 0

Watchers: 2

Dates

Created: 13/Nov/09 10:35

Updated: 11/Jun/22 23:11

Resolved: 08/Oct/20 04:54