HBASE-10418

Give blocks of smaller store files priority in cache

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: regionserver
    • Labels:
      None

      Description

      This is just an idea at this point; I don't have a patch, nor do I plan to make one in the near future.
      It is especially useful for datasets that don't fit in memory, and when scans are involved.
      Scans (and gets, in the absence of bloom filter help) have to read from all store files, so a short-range request will hit one block in every file.
      If small files are more likely to be entirely available in memory, requests will on average hit fewer blocks from the filesystem.
      For scans that read a lot of data, it is better to read blocks sequentially from a big file and serve the blocks of small files from cache, rather than a mix of filesystem and cached blocks from different files, because the (HBase) blocks of a big file are sequential within one HDFS block.
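
      The idea could be realized by folding the originating store file's size into the block cache's eviction order, so blocks from large files are evicted first and small files tend to stay fully resident. A minimal standalone sketch of that ordering — the class and field names here are hypothetical, not HBase's actual LruBlockCache API:

      ```java
      import java.util.*;

      // Hypothetical sketch (not actual HBase code): order cached blocks for
      // eviction so that blocks from larger store files go first, keeping
      // small files fully resident in the block cache.
      public class SmallFileFirstCache {
          // A cached block, tagged with the size of the store file it came from.
          static final class CachedBlock {
              final String fileName;
              final long storeFileSize;   // total size of the originating HFile
              final long lastAccessTime;  // classic LRU component

              CachedBlock(String fileName, long storeFileSize, long lastAccessTime) {
                  this.fileName = fileName;
                  this.storeFileSize = storeFileSize;
                  this.lastAccessTime = lastAccessTime;
              }
          }

          // Evict blocks from the largest files first; break ties by LRU age.
          static List<CachedBlock> evictionOrder(List<CachedBlock> blocks) {
              List<CachedBlock> order = new ArrayList<>(blocks);
              order.sort(Comparator
                      .comparingLong((CachedBlock b) -> b.storeFileSize).reversed()
                      .thenComparingLong(b -> b.lastAccessTime));
              return order;
          }

          public static void main(String[] args) {
              List<CachedBlock> blocks = Arrays.asList(
                      new CachedBlock("major-compacted", 10_000_000_000L, 5),
                      new CachedBlock("flush-1", 64_000_000L, 1),
                      new CachedBlock("flush-2", 64_000_000L, 3));
              // The big major-compacted file's block is evicted first even
              // though it was touched most recently; the small flush files
              // stay cached longer.
              System.out.println(evictionOrder(blocks).get(0).fileName); // prints "major-compacted"
          }
      }
      ```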

        Activity

        Andrew Purtell added a comment -

        This presumes that smaller HFiles contain the data of interest for short scans. What kind of mechanism do we have in place to make that more likely than not?

        Would it be better to do a bit of schema design such that small files / short scan data is segregated to one column family and the large files / large scan data to another, and then prioritize in cache by column family?

        Sergey Shelukhin added a comment -

        If you have such knowledge, yes. I am talking about an unknown data distribution, within the same table/CF for simplicity.
        First, if compactions happen in the normal pattern, we'll have a large file from major compaction and small files from flushes/minor compactions.
        If we don't know the data distribution, what is described above would be the expected pattern...
        Specifically for scans: they cannot use bloom filters, and pretty much have to hit a block of each file, no matter the data distribution, right?
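
        The arithmetic behind that point: with N store files, a short request touches one block per file, so fully caching the k smallest files cuts filesystem block reads from N to N - k. A toy model (not HBase code; sizes and cache budget are illustrative):

        ```java
        import java.util.*;

        // Toy model of "a short request hits one block in every store file":
        // greedily fit the smallest files into a cache budget and count how
        // many blocks must still come from the filesystem.
        public class BlockHitModel {
            // File sizes in MB; a request reads one block from each file.
            static long fsReads(long[] fileSizesMb, long cacheBudgetMb) {
                long[] sorted = fileSizesMb.clone();
                Arrays.sort(sorted); // fit the smallest files first
                long budget = cacheBudgetMb, fullyCached = 0;
                for (long size : sorted) {
                    if (size > budget) break;
                    budget -= size;
                    fullyCached++;
                }
                return fileSizesMb.length - fullyCached; // blocks read from FS
            }

            public static void main(String[] args) {
                // One 10 GB major-compacted file plus four small flush files,
                // with a 1 GB cache budget for this store: only the big
                // file's block still comes from the filesystem.
                long[] files = {10_240, 128, 64, 64, 32};
                System.out.println(fsReads(files, 1_024)); // prints 1
            }
        }
        ```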

        Sergey Shelukhin added a comment -

        Unless the file's key range and the scan's key range don't intersect.


          People

          • Assignee:
            Unassigned
          • Reporter:
            Sergey Shelukhin
          • Votes:
            0
          • Watchers:
            5
