Lucene80DocValuesProducer.TermsDict is used to look up either a term or a term ordinal.
Currently it lacks an optimization that FSTEnum has: the ability to continue a sequential scan from where the last lookup left off in the IndexInput. For sparse lookups (when searching for only a few terms or ordinals) this is not an issue. But for multiple lookups in a row, this optimization could avoid re-scanning all the terms from the block start (which is otherwise required, since they are delta encoded).
This patch proposes the optimization.
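To illustrate the idea, here is a minimal sketch of a resumable scan over a single delta-encoded block. It is a toy model and not the code in the patch: the class ResumableTermsBlock, its seekExact(ord, resume) method and the termReads counter are hypothetical names, and an in-memory array stands in for the IndexInput.

```java
import java.util.List;

/** Hypothetical toy model of one delta-encoded terms block; not Lucene's TermsDict. */
final class ResumableTermsBlock {
    // Each term is stored as the length of the prefix shared with the previous term, plus its suffix.
    private final int[] prefixLengths;
    private final String[] suffixes;

    // Scan state kept between lookups: ordinal and bytes of the last decoded term.
    private int currentOrd = -1;
    private final StringBuilder currentTerm = new StringBuilder();

    /** Number of terms decoded so far (stands in for "term reads" on the IndexInput). */
    int termReads = 0;

    ResumableTermsBlock(List<String> sortedTerms) {
        prefixLengths = new int[sortedTerms.size()];
        suffixes = new String[sortedTerms.size()];
        String previous = "";
        for (int i = 0; i < sortedTerms.size(); i++) {
            String term = sortedTerms.get(i);
            int shared = 0;
            int max = Math.min(previous.length(), term.length());
            while (shared < max && previous.charAt(shared) == term.charAt(shared)) {
                shared++;
            }
            prefixLengths[i] = shared;
            suffixes[i] = term.substring(shared);
            previous = term;
        }
    }

    /** Decodes the next term from the previous one: keep the shared prefix, append the suffix. */
    private void next() {
        currentOrd++;
        currentTerm.setLength(prefixLengths[currentOrd]);
        currentTerm.append(suffixes[currentOrd]);
        termReads++;
    }

    /**
     * Returns the term at the given ordinal. With resume == false the scan always restarts
     * from the block start; with resume == true it continues from the last decoded term
     * whenever the target is at or after the current position.
     */
    String seekExact(int ord, boolean resume) {
        if (ord < 0 || ord >= suffixes.length) {
            throw new IndexOutOfBoundsException("ord=" + ord);
        }
        if (!resume || ord < currentOrd) {
            // Delta encoding forces a restart: a term can only be rebuilt from its predecessors.
            currentOrd = -1;
            currentTerm.setLength(0);
        }
        while (currentOrd < ord) {
            next();
        }
        return currentTerm.toString();
    }

    public static void main(String[] args) {
        List<String> terms = List.of("apple", "apply", "banana", "band", "bandit");
        for (boolean resume : new boolean[] {false, true}) {
            ResumableTermsBlock block = new ResumableTermsBlock(terms);
            for (int ord = 0; ord < terms.size(); ord++) {
                block.seekExact(ord, resume);
            }
            System.out.println("resume=" + resume + " termReads=" + block.termReads);
        }
    }
}
```

In this toy model, looking up the five ordinals in increasing order decodes 15 terms without resuming and 5 terms with resuming. A single isolated lookup costs the same either way, which matches the sparse-lookup case described above.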
To estimate the gain, we ran 3 Lucene tests while counting the seeks and the term reads in the IndexInput, with and without the optimization:
TestLucene70DocValuesFormat - the optimization saves 24% seeks and 15% term reads.
TestDocValuesQueries - the optimization adds 0.7% seeks and 0.003% term reads.
TestDocValuesRewriteMethod.testRegexps - the optimization saves 71% seeks and 82% term reads.
In some cases, when scanning many terms in lexicographical order, the optimization saves a lot. In other cases, when only looking up a few sparse terms, the optimization brings no improvement, but its cost is negligible (at most 0.7% more seeks in the tests above). It seems worthwhile to always have it.