Thanks Nik: I can help with that one!
Another question, about the MergedIterator:
I can see the possible use case here, but I think it deserves some discussion first (versus just making it public).
This thing has limitations (it's currently only used by IndexWriter for buffered deletes; it's basically like a MultiTerms over an Iterator). For example, each iterator it consumes should not have duplicate values according to its compareTo(). It's not clear to me that this WeightedPhraseInfo behaves this way:
- what if you have a synonym of "dog" sitting on top of "cat" with the same boost factor... it's a duplicate according to that compareTo, but the text is different.
- what if the synonym is just "dog" with posinc=0 stacked on top of itself (which is totally valid to do)...
Perhaps highlighting can make use of it, but it's unclear to me that it's really following the contract. Furthermore, the class in question (WeightedPhraseInfo) is public, and adding Comparable to it looks like it will create a situation where compareTo() is inconsistent with equals()... I think this is a little dangerous.
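To make the compareTo/equals concern concrete, here is a rough sketch. PhraseInfoSketch is a hypothetical stand-in, not the real WeightedPhraseInfo, and I am assuming an ordering that looks only at the boost; the actual patch may compare more fields. The point is just that compareTo-based and equals-based collections disagree about whether "dog" and "cat" are the same element:

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for WeightedPhraseInfo: only text and boost.
final class PhraseInfoSketch implements Comparable<PhraseInfoSketch> {
  final String text;
  final float boost;

  PhraseInfoSketch(String text, float boost) {
    this.text = text;
    this.boost = boost;
  }

  // Assumed ordering: boost only, so different texts can compare as equal.
  @Override
  public int compareTo(PhraseInfoSketch other) {
    return Float.compare(boost, other.boost);
  }

  // equals() also looks at the text, so it disagrees with compareTo().
  @Override
  public boolean equals(Object o) {
    if (!(o instanceof PhraseInfoSketch)) {
      return false;
    }
    PhraseInfoSketch p = (PhraseInfoSketch) o;
    return text.equals(p.text) && Float.compare(boost, p.boost) == 0;
  }

  @Override
  public int hashCode() {
    return Objects.hash(text, boost);
  }

  // Number of elements kept by a compareTo-based collection.
  static int sortedSize(PhraseInfoSketch... infos) {
    Set<PhraseInfoSketch> set = new TreeSet<>();
    for (PhraseInfoSketch p : infos) {
      set.add(p);
    }
    return set.size();
  }

  // Number of elements kept by an equals/hashCode-based collection.
  static int hashedSize(PhraseInfoSketch... infos) {
    Set<PhraseInfoSketch> set = new HashSet<>();
    for (PhraseInfoSketch p : infos) {
      set.add(p);
    }
    return set.size();
  }
}
```

With "dog" and "cat" at the same boost, sortedSize keeps one element while hashedSize keeps two. Anything that dedupes by compareTo() would silently drop one of the two phrases in the same way.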
If it turns out we can reuse it: great! But I think rather than just slapping public on it, we should move it to .util, ensure it has good javadocs and unit tests, and investigate what exactly happens when these contracts are violated. E.g., can we make an exception happen, rather than just broken behavior, in a way that won't hurt performance?
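As one possible shape for the "exception rather than broken behavior" idea, here is a minimal sketch of a checking wrapper. StrictlyIncreasingIterator is a hypothetical name, not an existing Lucene class, and this is not how MergedIterator works today. It assumes the contract is "each input iterator is strictly increasing according to compareTo()", and costs one extra compareTo() per element:

```java
import java.util.Iterator;

// Hypothetical wrapper: fails fast if the wrapped iterator returns a
// duplicate or an out-of-order element according to compareTo().
final class StrictlyIncreasingIterator<T extends Comparable<? super T>> implements Iterator<T> {
  private final Iterator<T> in;
  private T previous; // null until the first element has been returned

  StrictlyIncreasingIterator(Iterator<T> in) {
    this.in = in;
  }

  @Override
  public boolean hasNext() {
    return in.hasNext();
  }

  @Override
  public T next() {
    T current = in.next();
    // One comparison per element: previous must sort strictly before current.
    if (previous != null && previous.compareTo(current) >= 0) {
      throw new IllegalStateException(
          "iterator contract violated: " + previous + " is not strictly before " + current);
    }
    previous = current;
    return current;
  }

  @Override
  public void remove() {
    throw new UnsupportedOperationException();
  }
}
```

Each sub-iterator could be wrapped this way before merging, or the same check could live behind an assert so it only runs in tests. Whether the extra comparison is acceptable on the buffered-deletes path would need measuring.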