I have been playing around with your patch. It looks good.
From the little testing I did, I can also say that the recommendations seem
to be more accurate than in my initial proposal (#4).
I have one suggestion, though: the current parameters (int
defaultMaxPrefsPerItemConsidered, int userItemCountMultiplier) are not very
clear and don't give enough control over the sampling.
I would introduce two other parameters (it won't be backwards-compatible):
- maxSourcePrefsConsidered: which will be used in conjunction with
SamplingLongPrimitiveIterator to do #1.
- maxFinalPrefs: which will set the value for 'int max' in your patch
(i.e. get rid of max = (int) Math.max(defaultMaxPrefsPerItemConsidered,
userItemCountMultiplier * Math.log(Math.max(dataModel.getNumUsers(),
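To make the two parameters concrete, here is a rough sketch of what I have in mind. It is plain Java, not the patch itself: SamplingSketch, sampleSource and capFinal are hypothetical names, the rate-based loop only stands in for what SamplingLongPrimitiveIterator would do, and I am assuming 'int max' is simply an upper bound on how many entries are kept.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper, for illustration only; not part of the patch.
final class SamplingSketch {

  private SamplingSketch() {
  }

  // Stand-in for the sampling iterator: keeps each ID with probability
  // maxSourcePrefsConsidered / size, so roughly that many IDs survive.
  static List<Long> sampleSource(List<Long> sourceIDs,
                                 int maxSourcePrefsConsidered,
                                 Random random) {
    if (sourceIDs.size() <= maxSourcePrefsConsidered) {
      return new ArrayList<>(sourceIDs); // small enough, no sampling needed
    }
    double rate = (double) maxSourcePrefsConsidered / sourceIDs.size();
    List<Long> sampled = new ArrayList<>();
    for (long id : sourceIDs) {
      if (random.nextDouble() < rate) {
        sampled.add(id);
      }
    }
    return sampled;
  }

  // Assumed meaning of maxFinalPrefs: a hard cap that replaces the
  // computed 'int max', keeping at most maxFinalPrefs entries.
  static List<Long> capFinal(List<Long> candidates, int maxFinalPrefs) {
    int end = Math.min(candidates.size(), maxFinalPrefs);
    return new ArrayList<>(candidates.subList(0, end));
  }
}
```

The point is that each knob does one thing: the first bounds how much of the source is looked at, the second bounds how much is kept, and neither depends on the size of the data model.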
In the future it would be possible to add a strategy that controls how the
maxSourcePrefsConsidered preferences are sampled: for example, most recent
items, least recent items, or random sampling (like we have now). Even though that
might not be the place to do so.. (since it's not in the context of the
What do you think?
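In case it helps the discussion, the strategy idea could look roughly like the sketch below. Everything in it is hypothetical (TimedPref, PrefSamplingStrategy, SamplingStrategies), and it assumes a timestamp is available for each preference. Note that the random variant here picks an exact number of entries by shuffling, which is not the same as the rate-based sampling we have now.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

// Hypothetical value type: an item ID plus the time the preference was set.
final class TimedPref {
  final long itemID;
  final long timestamp;

  TimedPref(long itemID, long timestamp) {
    this.itemID = itemID;
    this.timestamp = timestamp;
  }
}

// Hypothetical strategy: decides which source preferences are considered.
interface PrefSamplingStrategy {
  List<TimedPref> sample(List<TimedPref> prefs, int maxSourcePrefsConsidered);
}

final class SamplingStrategies {

  private SamplingStrategies() {
  }

  // Keeps the preferences with the newest timestamps.
  static PrefSamplingStrategy mostRecent() {
    Comparator<TimedPref> newestFirst =
        Comparator.comparingLong((TimedPref p) -> p.timestamp).reversed();
    return (prefs, max) -> firstN(prefs, newestFirst, max);
  }

  // Keeps the preferences with the oldest timestamps.
  static PrefSamplingStrategy leastRecent() {
    Comparator<TimedPref> oldestFirst =
        Comparator.comparingLong((TimedPref p) -> p.timestamp);
    return (prefs, max) -> firstN(prefs, oldestFirst, max);
  }

  // Keeps a uniformly random subset of at most max preferences.
  static PrefSamplingStrategy random(Random rng) {
    return (prefs, max) -> {
      List<TimedPref> copy = new ArrayList<>(prefs);
      Collections.shuffle(copy, rng);
      return new ArrayList<>(copy.subList(0, Math.min(max, copy.size())));
    };
  }

  // Sorts a copy (the input is left untouched) and keeps the first max entries.
  private static List<TimedPref> firstN(List<TimedPref> prefs,
                                        Comparator<TimedPref> order,
                                        int max) {
    List<TimedPref> copy = new ArrayList<>(prefs);
    copy.sort(order);
    return new ArrayList<>(copy.subList(0, Math.min(max, copy.size())));
  }
}
```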