Assigning to myself so I don't lose track of it.
LUCENE-7705 introduced the ability to configure the allowable token length for these tokenizers rather than hard-coding it to 255. It has always been the case that when the hard-coded limit was exceeded, multiple tokens were emitted, but the tests for LUCENE-7705 exposed a problem with that behavior.
Suppose the max length is 3 and the doc contains "letter". Two tokens are emitted and indexed: "let" and "ter".
Now suppose the search is for "lett". Query analysis emits the tokens "let" and "t", so if the default operator is AND, or if a phrase query is constructed, the query fails to match. Only with the OR operator is the document found, and even then the results aren't correct: searching for "lett" would also match a document indexed with "bett", because both produce the bare token "t".
The remainder of the token should be ignored when maxTokenLen is exceeded.
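To make the mismatch concrete, here is a small self-contained sketch (not the actual Lucene tokenizer code; the class and method names are invented for illustration) contrasting the current behavior, which emits the remainder as extra tokens, with the proposed behavior of ignoring it:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the maxTokenLen handling, not the real tokenizer code.
public class MaxTokenLenDemo {

    // Current behavior: split into chunks, e.g. "letter" with maxTokenLen=3
    // becomes ["let", "ter"], and "lett" becomes ["let", "t"].
    static List<String> splitTokens(String word, int maxTokenLen) {
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i < word.length(); i += maxTokenLen) {
            tokens.add(word.substring(i, Math.min(i + maxTokenLen, word.length())));
        }
        return tokens;
    }

    // Proposed behavior: keep the first maxTokenLen chars and drop the rest.
    static String truncateToken(String word, int maxTokenLen) {
        return word.substring(0, Math.min(maxTokenLen, word.length()));
    }

    public static void main(String[] args) {
        System.out.println(splitTokens("letter", 3)); // [let, ter]
        System.out.println(splitTokens("lett", 3));   // [let, t]
        // The stray "t" above is why "lett" can match a doc containing "bett".
        System.out.println(truncateToken("lett", 3)); // let
    }
}
```

With truncation, both the indexed "letter" and the queried "lett" reduce to "let", so the spurious matches on the bare "t" go away.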
I'm not quite sure why master generates a parsed query of:
and 6x generates:
so the tests succeeded on master but not on 6x...