Where does the "positionIncrementGap" come in? My naive impression is that the end() method is precisely where something like posIncAtt.setPositionIncrement(getPositionIncrementGap()) should be called.
PositionIncrementGap is not related to this bug, because it's handled separately by IndexWriter. It's always factored in correctly; it's not buggy.
The "bug" is that if a field instance ends with a stopword, the accumulated 'hole' is lost.
doc.add(new TextField("body", "just a", Field.Store.NO));
doc.add(new TextField("body", "test of gaps", Field.Store.NO));
PhraseQuery pq = new PhraseQuery();
pq.add(new Term("body", "just"), 0);
pq.add(new Term("body", "test"), 2);
// body:"just ? test"
assertEquals(1, is.search(pq, 5).totalHits); // FAIL!
So the problem is that the first instance of the field loses its hole for "a" (a stopword that was removed),
only because it ended on a stopword: incrementToken() returned false, so there was no subsequent token
to apply the 'hole' to.
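To make the arithmetic concrete, here is a toy model of the position bookkeeping in plain Java. It is only a sketch and not Lucene's indexing code: the class PositionHoleSketch, the index() method and the honorEndState flag are hypothetical names, and it assumes a stopword set containing "a" and "of" and a position increment gap of 0.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: mimics how positions accumulate across field instances.
class PositionHoleSketch {
    // Assumption: the analyzer removes these two words as stopwords.
    static final Set<String> STOPWORDS = new HashSet<>(Arrays.asList("a", "of"));

    // Returns term -> positions for several instances of the same field.
    // honorEndState=false models the bug (holes pending at the end of an
    // instance are dropped); true models pulling that state at end of stream.
    static Map<String, List<Integer>> index(List<String> instances, int gap, boolean honorEndState) {
        Map<String, List<Integer>> positions = new LinkedHashMap<>();
        int position = -1;      // position of the last slot consumed so far
        boolean first = true;
        for (String text : instances) {
            if (!first) {
                position += gap; // the gap is applied separately, per instance
            }
            first = false;
            int pendingHoles = 0; // removed stopwords not yet applied to a token
            for (String word : text.split("\\s+")) {
                if (word.isEmpty()) {
                    continue;
                }
                if (STOPWORDS.contains(word)) {
                    pendingHoles++;
                    continue;
                }
                position += pendingHoles + 1; // position increment of this token
                pendingHoles = 0;
                positions.computeIfAbsent(word, k -> new ArrayList<>()).add(position);
            }
            if (honorEndState) {
                position += pendingHoles; // trailing holes carried into the next instance
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        List<String> body = Arrays.asList("just a", "test of gaps");
        System.out.println(index(body, 0, false)); // {just=[0], test=[1], gaps=[3]}
        System.out.println(index(body, 0, true));  // {just=[0], test=[2], gaps=[4]}
    }
}
```

In the first case "test" lands at position 1, so the phrase "just ? test" (positions 0 and 2) cannot match; in the second case the trailing hole survives and "test" lands at position 2.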
This is the same problem as a field instance ending with, say, a space character and all the offsets being wrong.
That is why end() was added: so the indexer can pull the 'end of stream state'.