[HBASE-12311] Version stats in HFiles? - ASF JIRA

Details

Type: Brainstorming
Status: Closed
Priority: Major
Resolution: Invalid
Affects Version/s: None
Fix Version/s: None
Component/s: None
Labels: None

Description

In ~~HBASE-9778~~ I basically punted to the user the decision on whether to do repeated scanner.next() calls instead of issuing (re)seeks.
I think we can do better.

One way to do that is to maintain simple stats on the maximum number of versions we've seen for any row/col combination and store these in the HFile's metadata (just like the timerange, oldest Put, etc).

Then we can estimate fairly accurately whether we have to expect lots of versions (i.e. seeking between columns is better) or not (in which case we'd issue repeated next() calls).
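The idea above can be illustrated with a minimal, self-contained Java sketch. It does not use the HBase API and is not taken from the attached patches: the class `VersionStatsSketch`, the `MaxVersionsTracker` helper, the `preferSeek` method, and the `seekThreshold` parameter are all hypothetical names chosen for illustration. It assumes cells arrive in sorted order at write time (so all versions of one row/col are adjacent), and that the resulting maximum would be stored alongside the file's other metadata.

```java
public class VersionStatsSketch {

    /**
     * Hypothetical write-time tracker: records the largest number of
     * versions seen for any single row/col combination. Assumes cells
     * are passed in sorted order, so versions of a row/col are adjacent.
     */
    static final class MaxVersionsTracker {
        private String lastRow;
        private String lastColumn;
        private long currentCount;
        private long maxVersions;

        void track(String row, String column) {
            if (row.equals(lastRow) && column.equals(lastColumn)) {
                currentCount++;          // another version of the same row/col
            } else {
                lastRow = row;           // a new row/col starts here
                lastColumn = column;
                currentCount = 1;
            }
            maxVersions = Math.max(maxVersions, currentCount);
        }

        long getMaxVersions() {
            return maxVersions;
        }
    }

    /**
     * Hypothetical read-time heuristic: if a scan could have to step over
     * more than seekThreshold unwanted versions, prefer a seek to the next
     * column; otherwise prefer repeated next() calls. The threshold is an
     * assumption of this sketch, not a value from the issue.
     */
    static boolean preferSeek(long maxVersionsInFile, int versionsRequested,
                              long seekThreshold) {
        long skippable = maxVersionsInFile - versionsRequested;
        return skippable > seekThreshold;
    }

    public static void main(String[] args) {
        MaxVersionsTracker tracker = new MaxVersionsTracker();
        tracker.track("r1", "c1");
        tracker.track("r1", "c1");
        tracker.track("r1", "c1");
        tracker.track("r1", "c2");
        tracker.track("r2", "c1");
        tracker.track("r2", "c1");

        long max = tracker.getMaxVersions();
        System.out.println("max versions = " + max);
        System.out.println("prefer seek  = " + preferSeek(max, 1, 10));
    }
}
```

With so few versions per column, the heuristic in this sketch would favor repeated next() calls; a file whose recorded maximum is far above the requested version count would tip it toward seeking.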

Attachments

- 12311.txt, 25/Oct/14 00:14, 40 kB, Lars Hofhansl
- 12311-indexed-0.98.txt, 28/Feb/15 06:01, 7 kB, Lars Hofhansl
- 12311-indexed-0.98-v2.txt, 28/Feb/15 07:36, 19 kB, Lars Hofhansl
- 12311-v2.txt, 07/Nov/14 01:42, 67 kB, Lars Hofhansl
- 12311-v3.txt, 07/Nov/14 05:24, 72 kB, Lars Hofhansl
- CellStatTracker.java, 22/Oct/14 21:42, 2 kB, Lars Hofhansl

Issue Links

is related to: HBASE-17756 We should have better introspection of HFiles (Closed)

People

Assignee: Unassigned

Reporter: Lars Hofhansl

Votes: 0

Watchers: 10

Dates

Created: 21/Oct/14 17:57

Updated: 17/Jun/22 17:47

Resolved: 04/Jul/15 00:06