Description
The problem is simple:
- term vectors can be configured "per-field-per-document", meaning that for the "body" field, document 0 can have them, document 1 might not have them at all, document 2 might have offsets (but no positions), and so on. To me this is not a useful feature at all: no one has ever mentioned a single use case for it, and it just makes our code more complicated. But it is what it is (for this issue).
- there is no way to discover these options for a field of a document; you have to do things like 'peek ahead' to see whether the first position of the first term is -1, or do the same for offsets (except worse: we used to allow anything in offsets, so -1 might be an actual value). This makes the merging code really hairy, and it is tough on end consumers.
So I propose that instead of returning Terms for vectors, we return VectorTerms (extends Terms), which just adds hasOffsets() and hasPositions(). E.g. lucene40 already knows this from the bits for the field/doc pair and can just return what it knows.
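
A rough sketch of the shape this could take, to make the proposal concrete. Everything here is illustrative: the `Terms` class is a tiny stand-in so the example compiles on its own (it is not the real Lucene class), and `FlaggedVectorTerms`, its bit constants, and `VectorMergeSketch` are hypothetical names, not the actual lucene40 format or merge code.

```java
// Stand-in for Lucene's Terms so this sketch is self-contained.
// The real class has a much larger API; only one method is mimicked here.
abstract class Terms {
    abstract long size();
}

// The proposed addition: a Terms subclass that reports which per-document
// term vector options were actually stored for this field.
abstract class VectorTerms extends Terms {
    abstract boolean hasOffsets();
    abstract boolean hasPositions();
}

// Hypothetical codec-side implementation: the flag bits are assumed to be
// read from the stored field/doc entry. The bit layout is made up.
final class FlaggedVectorTerms extends VectorTerms {
    static final int POSITIONS = 0x1; // assumed bit, not the real format
    static final int OFFSETS = 0x2;   // assumed bit, not the real format

    private final int flags;
    private final long termCount;

    FlaggedVectorTerms(int flags, long termCount) {
        this.flags = flags;
        this.termCount = termCount;
    }

    @Override
    long size() {
        return termCount;
    }

    @Override
    boolean hasOffsets() {
        return (flags & OFFSETS) != 0;
    }

    @Override
    boolean hasPositions() {
        return (flags & POSITIONS) != 0;
    }
}

// Hypothetical consumer: a merger can branch on the flags up front instead
// of peeking at the first term to see whether its position or offset is -1.
final class VectorMergeSketch {
    static String describe(VectorTerms terms) {
        if (terms.hasPositions() && terms.hasOffsets()) {
            return "positions+offsets";
        } else if (terms.hasPositions()) {
            return "positions";
        } else if (terms.hasOffsets()) {
            return "offsets";
        }
        return "none";
    }
}
```

With something like this, the "document 2 has offsets (no positions)" case is answered by `hasOffsets()` returning true and `hasPositions()` returning false, with no need to interpret a -1 sentinel.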