Attached patch implements a prefetching function for HFile (v3) blocks, if indicated by a column family or regionserver property. The purpose of this change is to as rapidly after region open as reasonable warm the blockcache with all the data and index blocks of (presumably also in-memory) table data, without counting those block loads as cache misses. Great for fast reads and keeping the cache hit ratio high. Can tune the IO impact versus time until all data blocks are in cache. Works a bit like CompactSplitThread. Makes some effort not to stampede.
I have been using this for setting up various experiments and thought I'd polish it up a bit and throw it out there. If the data to be preloaded will not fit in blockcache, or if as a percentage of blockcache it is large, this is not a good idea, will just blow out the cache and trigger a lot of useless GC activity. Might be useful as an expert tuning option though. Or not.
|Document blockcache prefetch option||Closed|