How about renaming key back to ord? And then maybe rename values to
bytesStart? And in their decls add comments saying they are indexed
by hash code? And maybe rename addByOffset -> addByBytesStart?
I don't like addByBytesStart; I would like to keep "offset", since it really is an offset into the pool. How about addByPoolOffset?
The names ord and bytesStart are a good compromise; let's shoot for that.
On the nocommit in ByteBlockPool - I think that's fine? It's an
Do you mean this one: // nocommit - public arrays are not nice! ?
Yeah, that's more of a style thing, but if somebody changes them it's their fault for being stupid, I guess.
The nocommit in BytesRefHash seems wrong? (Ie, compact is used
internally)... though maybe we make it private if it's not used
Ah yeah, that's bogus. It's from a previous iteration, which was wrong as well; I will remove it.
On the "nocommit factor this out!" in THPF.java... I agree, the
postingsArray.textStarts should go away right? Ie, it's a
[wasteful] copy of what the BytesRefHash is already storing?
Yeah, that is the reason for that nocommit. I thought about this a little, and I see two options.
- We could factor out a superclass from ParallelPostingArray that has only the textStart int array plus the grow and copy methods, and let ParallelPostingArray subclass it.
BytesRefHash would accept this class (I don't have a good name for it, but let's call it TextStartArray for now) and use it internally. BytesRefHash would call grow() when needed, and all the other code would be unchanged since PPA is a subclass.
- The other way would be to bind the BytesRefHash to the postings array, which seems odd to me, though.
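A rough sketch of the first option, to make the shape concrete. The class name TextStartArray is the placeholder from above; PostingsArraySketch, newInstance(), copyTo() and the growth factor are assumptions made up for illustration, not the actual Lucene code:

```java
// Hypothetical base class: only the textStarts array plus grow/copy.
class TextStartArray {
    int[] textStarts;

    TextStartArray(int size) {
        textStarts = new int[size];
    }

    int size() {
        return textStarts.length;
    }

    // Factory hook so that grow() returns the caller's own subclass.
    TextStartArray newInstance(int size) {
        return new TextStartArray(size);
    }

    // Copies the first numToCopy entries into target.
    void copyTo(TextStartArray target, int numToCopy) {
        System.arraycopy(textStarts, 0, target.textStarts, 0, numToCopy);
    }

    // Grows by roughly 1.5x (an assumed policy) and copies the old content.
    final TextStartArray grow() {
        int oldSize = size();
        int newSize = Math.max(oldSize + 1, oldSize + (oldSize >> 1));
        TextStartArray bigger = newInstance(newSize);
        copyTo(bigger, oldSize);
        return bigger;
    }
}

// Stand-in for the postings array: adds its own parallel arrays.
class PostingsArraySketch extends TextStartArray {
    int[] intStarts;
    int[] byteStarts;

    PostingsArraySketch(int size) {
        super(size);
        intStarts = new int[size];
        byteStarts = new int[size];
    }

    @Override
    TextStartArray newInstance(int size) {
        return new PostingsArraySketch(size);
    }

    @Override
    void copyTo(TextStartArray target, int numToCopy) {
        super.copyTo(target, numToCopy);
        PostingsArraySketch t = (PostingsArraySketch) target;
        System.arraycopy(intStarts, 0, t.intStarts, 0, numToCopy);
        System.arraycopy(byteStarts, 0, t.byteStarts, 0, numToCopy);
    }
}
```

With this shape the hash would only ever see a TextStartArray and call grow() on it; the subclass keeps its extra arrays in step through the overridden copyTo().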
Can we impl BytesRefHash.bytesUsed as an AtomicLong (hmm maybe
AtomicInt - none of these classes can address > 2GB)? Then the
pool would add in blockSize every time it binds a new block. That
method (DW.bytesUsed) is called a lot - at least once on every
I did exactly that in the not-yet-uploaded patch. But I figured it might make more sense to use that AtomicInt in the allocator as well as in THPF; or is that what you mean?
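As a minimal sketch of the shared-counter idea: one AtomicLong is handed to every pool, and each pool adds its block size whenever it binds a new block, so reading the total is a single get(). CountingBlockPool, BLOCK_SIZE and nextBuffer() are names assumed for illustration, not the real ByteBlockPool API:

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical pool that reports its allocations to a shared counter.
class CountingBlockPool {
    // Assumed block size, for illustration only.
    static final int BLOCK_SIZE = 1 << 15;

    private final AtomicLong bytesUsed;
    byte[] buffer;

    CountingBlockPool(AtomicLong bytesUsed) {
        this.bytesUsed = bytesUsed;
    }

    // Binds a fresh block and accounts for it in the shared counter.
    void nextBuffer() {
        buffer = new byte[BLOCK_SIZE];
        bytesUsed.addAndGet(BLOCK_SIZE);
    }
}
```

The caller that needs the total just reads the counter and does not have to walk the pools or take a lock.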
I'm confused again - when do we use RecyclingByteBlockAllocator
from a single thread...? Ie, why did the sync need to be
conditional for this class, again....? It seems like we always
need it sync'd (both the main pool & per-doc pool need this)? If
so we can simplify and make these methods sync'd?
Man, I am sorry. I thought I would use this in
LUCENE-2186 in a single-threaded environment, but if so I should change it there if needed. I was one step ahead, though.
I will change it, and maybe add a second one if needed. Agree?
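For reference, the always-synchronized variant could look roughly like this. It is only a sketch: SyncRecyclingAllocatorSketch and its method names are assumptions loosely modelled on the discussion, not the actual RecyclingByteBlockAllocator:

```java
import java.util.ArrayDeque;

// Hypothetical recycling allocator with unconditional synchronization.
class SyncRecyclingAllocatorSketch {
    private final int blockSize;
    private final ArrayDeque<byte[]> freeBlocks = new ArrayDeque<>();

    SyncRecyclingAllocatorSketch(int blockSize) {
        this.blockSize = blockSize;
    }

    // Reuses a recycled block if one is available, else allocates a new one.
    synchronized byte[] getByteBlock() {
        byte[] block = freeBlocks.pollFirst();
        return block != null ? block : new byte[blockSize];
    }

    // Returns blocks[start..end) to the free list and clears the caller's slots.
    synchronized void recycleByteBlocks(byte[][] blocks, int start, int end) {
        for (int i = start; i < end; i++) {
            freeBlocks.addFirst(blocks[i]);
            blocks[i] = null;
        }
    }

    synchronized int numFreeBlocks() {
        return freeBlocks.size();
    }
}
```

Since both the main pool and the per-doc pool go through the same methods, plain synchronized methods keep it simple; a separate unsynchronized variant could be added later if a single-threaded use really shows up.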