It gets better and better: I saw LUCENE-1340 committed. Thanks to you, Grant, Doug, and all the others who voted for 1349, this happened so quickly. Trust me, these two issues are really making my life easier. I have pushed the decision to add new hardware to some future point (meaning we save the customer's money now)... a few weeks later would have been too late.
Now all that remains is one nice patch that lets us pass in our own byte[] for retrieving stored fields during search. I was thinking along the lines of what you did in the Analyzers.
We could use the same trick for this, e.g.:
Field Document.getBinaryValue(String fieldName, Field destination);
Field already has all the access methods (get/set), and the contract would be: if destination == null, a new one is created and returned; if not, we fill the one passed in and return that same object back. The method should check whether the byte[] is big enough; if not, a simple growth policy can kick in. This way we avoid allocating a new byte[] each time you fetch a stored field.
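To make the contract concrete, here is a minimal sketch of the reuse idea (the class and method names are made up for illustration, not an existing Lucene API): the caller may pass in a destination buffer, which is grown only when the stored value does not fit, so steady-state lookups allocate nothing.

```java
// Hypothetical sketch of the proposed reuse contract; not actual Lucene code.
public class ReusableBuffer {
    private byte[] bytes = new byte[0];
    private int length;

    // Mirrors the proposed contract: if destination == null, a new buffer is
    // created and returned; otherwise the same object is filled and returned.
    public static ReusableBuffer copyTo(byte[] value, ReusableBuffer destination) {
        if (destination == null) {
            destination = new ReusableBuffer();
        }
        return destination.fill(value);
    }

    // Copy 'value' into this buffer, growing the backing array only if needed.
    public ReusableBuffer fill(byte[] value) {
        if (bytes.length < value.length) {
            // simple growth policy: at least double, or exactly what is needed
            bytes = new byte[Math.max(value.length, bytes.length * 2)];
        }
        System.arraycopy(value, 0, bytes, 0, value.length);
        length = value.length;
        return this;
    }

    public byte[] bytes() { return bytes; }
    public int length() { return length; }
}
```

Note that bytes() can return an array larger than the stored value, so callers must use length() rather than bytes().length — the same caveat the real reusable Token APIs have.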
I have not looked at the code just now, but the last time I looked into it, it seemed quite simple to do something along these lines. Do you have any ideas how we could do it better?
Just a simple calculation in my case:
the average Hits count is around 200, and for each hit we have to fetch one stored field on which we do some post-processing, re-scoring and whatnot. Currently we run at most 30 requests/second, so with an average document length of 2K you land at 200 * 30 = 6,000 object allocations per second, totaling 2K * 6,000 = 12MB... only to get the data. I can imagine people with much longer documents (that would be a typical Lucene use case) where it gets even worse. Simply reducing gc() pressure with a really small amount of work. I am sure this would have nice effects on some other Lucene use cases as well.
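For anyone who wants to check the back-of-the-envelope numbers above, the arithmetic is just:

```java
// Back-of-the-envelope check of the allocation figures quoted above.
public class AllocationEstimate {
    public static void main(String[] args) {
        int hitsPerRequest = 200;      // average Hits count
        int requestsPerSecond = 30;    // max request rate
        int docBytes = 2 * 1024;       // average stored field size, 2K

        int allocationsPerSecond = hitsPerRequest * requestsPerSecond;
        long bytesPerSecond = (long) allocationsPerSecond * docBytes;

        System.out.println(allocationsPerSecond);           // 6000
        System.out.println(bytesPerSecond / 1_000_000);     // 12 (MB)
    }
}
```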
Thanks again to all the "workers" behind this great piece of software...
PS: I need to find some time to peek at Paul's work in LUCENE-1345, and then my wish list will be complete, at least for now (or at least until you get your magic with the flexible index format done).