Description
I noticed odd payload behavior with Solr 6.3.0, and also with 7.7.2 and 8.3.1. After writing a Lucene62Codec-specific unit test (see attached; it can also be run against the later versions), I think there may be a bug that causes the payload of a term in one document to be written into the payload of the same term in another document (or, alternatively, the second document's payload is not being read back correctly).
For comparison, I added SimpleTextCodec, which does not behave this way.
For 8.3.1, you will need to change MultiFields.getTermPositionsEnum(...) to MultiTerms.getTermPostingsEnum(...).
Thanks to Alan Woodward, I made the necessary changes to the analyzer to address the sharing of TokenStreamComponents, which the TestPayloads class relied on. I now use a non-mocked tokenizer and a new filter that creates a random payload (see attached), so documents one and two have the same token but different payloads.
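For readers without the attachment, a filter of this kind could look roughly like the sketch below. This is not the attached filter: the class name RandomPayloadFilter, the injected Random, and the 4-byte payload length are all assumptions made for illustration. It only uses the standard TokenFilter and PayloadAttribute APIs.

```java
import java.io.IOException;
import java.util.Random;

import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.PayloadAttribute;
import org.apache.lucene.util.BytesRef;

// Hypothetical filter (not the attached one): attaches a fresh random
// payload to every token produced by the wrapped stream.
final class RandomPayloadFilter extends TokenFilter {
  // Assumed payload size; the attached test may use a different length.
  private static final int PAYLOAD_LENGTH = 4;

  private final PayloadAttribute payloadAtt = addAttribute(PayloadAttribute.class);
  private final Random random;

  RandomPayloadFilter(TokenStream input, Random random) {
    super(input);
    this.random = random;
  }

  @Override
  public boolean incrementToken() throws IOException {
    if (!input.incrementToken()) {
      return false; // wrapped stream is exhausted
    }
    // Allocate a new array per token so no payload bytes are shared
    // between tokens or between documents.
    byte[] bytes = new byte[PAYLOAD_LENGTH];
    random.nextBytes(bytes);
    payloadAtt.setPayload(new BytesRef(bytes));
    return true;
  }
}
```

Wrapping the tokenizer with such a filter inside Analyzer.createComponents(...) means each document gets its own payload bytes for the same term, which is the condition the test depends on.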
The result is the same: SimpleTextCodec passes the test, but the following codecs do not:
Lucene50Codec;
Lucene54Codec;
Lucene62Codec;
Lucene70Codec;
Lucene80Codec;