Maybe BasePostingsFormatTestCase does not adequately exercise methods like size()/ord()/seek(ord). It should be failing!
FWIW, as far as i understand BasePostingsFormatTestCase and RandomPostingsTester based on skimming them this morning, they may not ever reproduce this bug since (AFAICT) they only ever operate on single segment indexes?
As mentioned: this patch only ever fails for me when testing the SlowCompositeReaderWrapper – asserts on the individual segment LeafReaders seem to pass all the time (even though one segment is forced to have every term that's in the index as a whole). Likewise if you iw.forceMerge(1); then the SlowCompositeReaderWrapper asserts start to pass as well.
I've updated the patch to include the test from SOLR-7631, as well as beefing up TestUninvertingReader.testSortedSetIntegerManyValues to include all (4) permutations of multi/single-valued + (no)-precisionStep (didn't turn up anything unexpected, only the trie fields are problematic), as well as running TestUtil.checkReader on the SlowCompositeReader before using it. This last change started triggering failures much earlier...
[junit4] 2> NOTE: reproduce with: ant test -Dtestcase=TestUninvertingReader -Dtests.method=testSortedSetIntegerManyValues -Dtests.seed=3A8A592786F36F30 -Dtests.slow=true -Dtests.locale=in_ID -Dtests.timezone=Zulu -Dtests.asserts=true -Dtests.file.encoding=UTF-8
[junit4] ERROR 0.56s | TestUninvertingReader.testSortedSetIntegerManyValues <<<
[junit4] > Throwable #1: java.lang.RuntimeException: dv for field: trie_multi reports wrong maxOrd=33 but this is not the case: 30
[junit4] > at __randomizedtesting.SeedInfo.seed([3A8A592786F36F30:DB56E81A1372E276]:0)
[junit4] > at org.apache.lucene.index.CheckIndex.checkSortedSetDocValues(CheckIndex.java:1917)
[junit4] > at org.apache.lucene.index.CheckIndex.checkDocValues(CheckIndex.java:1987)
[junit4] > at org.apache.lucene.index.CheckIndex.testDocValues(CheckIndex.java:1790)
[junit4] > at org.apache.lucene.util.TestUtil.checkReader(TestUtil.java:318)
[junit4] > at org.apache.lucene.util.TestUtil.checkReader(TestUtil.java:297)
[junit4] > at org.apache.lucene.uninverting.TestUninvertingReader.testSortedSetIntegerManyValues(TestUninvertingReader.java:284)
...so for good measure, i sprinkled in TestUtil.checkReader in some of the other oal.uninverting.* tests i could find using SlowCompositeReader – but based on my limited beasting, this hasn't triggered any other failures.
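For anyone reading the trace above without CheckIndex open: a minimal, self-contained sketch of the kind of invariant that error message suggests, namely that the max ord implied by the reported value count should match the largest ord actually seen across all docs. This is NOT Lucene's CheckIndex code; MaxOrdCheck, maxSeenOrd and checkMaxOrd are hypothetical names, the long[][] is just a stand-in for real sorted-set doc values, and the ordering of the two numbers in the message is my assumption from reading it.

```java
// Hypothetical illustration only: none of these names exist in Lucene.
final class MaxOrdCheck {

    /** Largest ordinal referenced by any document, or -1 if no document has a value. */
    public static long maxSeenOrd(long[][] docOrds) {
        long max = -1;
        for (long[] ords : docOrds) {
            for (long ord : ords) {
                max = Math.max(max, ord);
            }
        }
        return max;
    }

    /**
     * Throws if the max ord implied by the reported value count (valueCount - 1)
     * differs from the largest ord actually observed in the documents.
     * The message mimics the wording of the failure in the trace (assumed semantics).
     */
    public static void checkMaxOrd(String field, long valueCount, long[][] docOrds) {
        long reported = valueCount - 1;
        long seen = maxSeenOrd(docOrds);
        if (reported != seen) {
            throw new RuntimeException("dv for field: " + field
                + " reports wrong maxOrd=" + reported
                + " but this is not the case: " + seen);
        }
    }

    public static void main(String[] args) {
        // Three docs, multi-valued; the ords in use are 0..2.
        long[][] docOrds = { {0, 1}, {2}, {1, 2} };

        // Consistent view: 3 distinct values, max ord 2.
        checkMaxOrd("consistent", 3, docOrds);

        // Inconsistent view: claims 5 values, but no doc references ord 3 or 4.
        try {
            checkMaxOrd("inconsistent", 5, docOrds);
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If that reading is right, the failure above means the composite view claims more unique terms for trie_multi than any doc actually points at, which would fit the observation that the per-segment readers pass while the merged view does not.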
(note: patch still has nocommits related to limiting some of the random variables)
If i disable the ord-sharing optimization in DocTermOrds, all 3 seeds pass. So I think there is a bug in e.g. FixedGap/BlockTerms dictionary or something like that.
My inclination would be that we should remove this optimization for 5.2.1, commit these tests, and open a new issue to re-add the optimization if/when it can be done in such a way that these tests pass reliably.
what do folks think?