I will attempt to make a new O(1) cache called FastLFUCache
Please, please, please let's end the madness of subjective adjectives in class names ... if it's an LFU cache wrapped around a "hawtdb", why don't we just call it "HawtDbLFUCache"?
I've been working on this. I've come to realize that I don't completely understand how CacheRegenerator works. I suspect that it is geared around LRU caches and that the new cache won't have any of the frequency information from the old one; it will just put the entries into the cache as if they were new. Can anyone confirm this?
The idea behind the CacheRegenerator API is to be as simple as possible and agnostic to:
- the Cache Impl (e.g. LRUCache vs LFUCache vs HawtDbLFUCache)
- the cache usage (e.g. Query->DocSets vs Query->DocList vs String->MyCustomClass)
- the means of generating values from keys (i.e. how you know which MyCustomClass should be cached for which String)
... so you can have a custom (named) cache instance declared in your solrconfig.xml with your own MySpecialCacheRegenerator that knows about your use case and might do something special with the keys/values (like: short-circuit part of the generation if it can see the data hasn't changed, or read from authoritative data files outside of solr, etc...) and then use any Cache impl class that your heart desires, and things will still work right.
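To make the "short-circuit if the data hasn't changed" idea concrete, here is a rough, self-contained sketch. Everything in it is made up for illustration: SimpleRegenerator is a simplified stand-in and not Solr's actual CacheRegenerator interface, the version numbers are an assumed way of detecting changes, and a plain Map plays the role of the cache to show that the regenerator doesn't care which impl sits behind it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical, simplified stand-in for the regenerator idea (NOT Solr's real interface). */
interface SimpleRegenerator<K, V> {
    /** Returns the value to store in the new cache for oldKey, or null to skip the entry. */
    V regenerate(K oldKey, V oldValue);
}

/** Assumed example: reuse old values when the underlying data version is unchanged. */
class VersionAwareRegenerator implements SimpleRegenerator<String, String> {
    private final long oldVersion;
    private final long newVersion;
    int recomputed = 0; // counts how often the expensive path was taken

    VersionAwareRegenerator(long oldVersion, long newVersion) {
        this.oldVersion = oldVersion;
        this.newVersion = newVersion;
    }

    @Override
    public String regenerate(String oldKey, String oldValue) {
        if (oldVersion == newVersion) {
            return oldValue; // short-circuit: data hasn't changed, keep the old value
        }
        recomputed++;
        return expensiveLoad(oldKey);
    }

    /** Placeholder for whatever really builds a value from a key. */
    private String expensiveLoad(String key) {
        return key.toUpperCase() + "@v" + newVersion;
    }
}

class RegeneratorSketch {
    /** Warms a new map from an old one; any cache impl could do the same loop. */
    static <K, V> Map<K, V> warm(Map<K, V> old, SimpleRegenerator<K, V> regen) {
        Map<K, V> fresh = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : old.entrySet()) {
            V v = regen.regenerate(e.getKey(), e.getValue());
            if (v != null) {
                fresh.put(e.getKey(), v);
            }
        }
        return fresh;
    }
}
```

The warming loop only ever sees keys, old values, and whatever the regenerator hands back, which is why the same regenerator could sit behind an LRU, LFU, or any other impl.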
After the new cache is regenerated, should I go through the new cache, grab the frequency information from the old cache with each key, and fix the new cache up?
You certainly could. When (new HawtDbLFUCache(...)).warm(...) is called, it needs to delegate to the regenerator for pulling values from the "old" cache, but that doesn't mean it can't also directly ask the "old" cache instance for stats about each of the old keys as it loops over them. Remember: the "new" cache is the one inspecting the "old" cache and deciding what things to ask the regenerator to generate.
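Roughly what that could look like, as a self-contained sketch. TinyLfuSketch is a made-up toy class (no eviction, no Solr types), the BiFunction stands in for the regenerator, and the carryOverHits flag is my own assumption to show that copying the counts is a choice the new cache makes during warm, not something the regenerator has to know about.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

/** Hypothetical toy LFU-style cache; only the pieces needed to illustrate warming. */
class TinyLfuSketch<K, V> {
    private final Map<K, V> values = new HashMap<>();
    private final Map<K, Integer> hits = new HashMap<>();

    void put(K key, V value) {
        values.put(key, value);
        hits.putIfAbsent(key, 0);
    }

    V get(K key) {
        V v = values.get(key);
        if (v != null) {
            hits.merge(key, 1, Integer::sum); // record a hit for frequency tracking
        }
        return v;
    }

    int hitCount(K key) {
        return hits.getOrDefault(key, 0);
    }

    /**
     * Warms this (new) cache from the old one. The regenerator stand-in maps
     * (oldKey, oldValue) to the new value, or null to skip the entry.
     * The new cache is free to read the old cache's stats while it loops.
     */
    void warm(TinyLfuSketch<K, V> old, BiFunction<K, V, V> regenerator, boolean carryOverHits) {
        for (Map.Entry<K, V> e : old.values.entrySet()) {
            V fresh = regenerator.apply(e.getKey(), e.getValue());
            if (fresh == null) {
                continue; // regenerator declined to produce a value for this key
            }
            values.put(e.getKey(), fresh);
            // Ask the old cache directly for its stats instead of going through the regenerator.
            hits.put(e.getKey(), carryOverHits ? old.hitCount(e.getKey()) : 0);
        }
    }
}
```

Calling warm(old, regen, true) keeps the old frequencies, while warm(old, regen, false) starts every regenerated entry at zero; either way the regenerator itself never sees the counts.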
But I question whether you really want any sort of stats from the "old" cache copied over to the "new" cache. It is, after all, a completely new cache with new usage. Should the stats really be preserved forever? Regardless of how popular an object was in the "old" cache instance, should we automatically assume it's equally popular in the "new" cache instance?