Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Lucene Fields: New
Description
Spinoff from this fun build failure that dweiss root-caused: https://lucene.markmail.org/thread/pculfuazll4oebra
Thank you and sorry dweiss!!
The test failed because the test case randomly indexed a chunk of the nightly (many GBs) LineFileDocs Wikipedia file that contained a massive term (larger than IndexWriter's ~32 KB limit), so IndexWriter threw an IllegalArgumentException.
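For anyone trying to reproduce this, here is a rough sketch of how one might check whether a given term would trip the limit. It is plain Java with no Lucene dependency. The class name `TermLengthCheck` and the constant are hypothetical, and the value 32766 is an assumption meant to mirror `IndexWriter.MAX_TERM_LENGTH`; the "~32 KB" figure above is the only number actually stated in this issue. Note that the limit is assumed to apply to the UTF-8 encoded length, not the character count.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Lucene.
final class TermLengthCheck {
  // Assumed to match IndexWriter.MAX_TERM_LENGTH (about 32 KB).
  static final int ASSUMED_MAX_TERM_BYTES = 32766;

  // Length of the term once encoded as UTF-8.
  static int utf8Length(String term) {
    return term.getBytes(StandardCharsets.UTF_8).length;
  }

  // True if the encoded term is longer than the assumed limit.
  static boolean exceedsLimit(String term) {
    return utf8Length(term) > ASSUMED_MAX_TERM_BYTES;
  }
}
```

A term made of non-ASCII characters can exceed the limit with far fewer characters than bytes, which is part of why such terms are easy to miss when eyeballing the data.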
It's crazy that it took so long for Lucene's randomized tests to discover this too-massive term in Lucene's nightly benchmarks. It's like searching for Nessie, or SETI.
We need to prevent such false failures somehow, and there are multiple options: fix this test to not use LineFileDocs; remove all "massive" terms from every LineFileDocs file the tests use (both the nightly one and the one in git); fix MockTokenizer to trim such ridiculous terms (I think this is the best option?); ...
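To make the "trim" option concrete, here is a minimal sketch of what trimming a token to a maximum length could look like. This is not MockTokenizer's code and not necessarily how the fix should be done; `TokenTrimmer` and `trim` are hypothetical names, and the cap is passed in by the caller. The only subtlety it handles is not cutting a surrogate pair in half.

```java
// Hypothetical helper for illustration only; not part of Lucene.
final class TokenTrimmer {
  // Returns the token cut down to at most maxChars UTF-16 chars,
  // backing off by one char if the cut would split a surrogate pair.
  static String trim(String token, int maxChars) {
    if (maxChars < 0) {
      throw new IllegalArgumentException("maxChars must be >= 0");
    }
    if (token.length() <= maxChars) {
      return token; // already short enough
    }
    int end = maxChars;
    if (end > 0
        && Character.isHighSurrogate(token.charAt(end - 1))
        && Character.isLowSurrogate(token.charAt(end))) {
      end--; // keep the pair together by dropping it entirely
    }
    return token.substring(0, end);
  }
}
```

With a cap of, say, 255 chars (an arbitrary number picked for the example), the trimmed token encodes to at most 765 UTF-8 bytes, since each UTF-16 char contributes at most 3 bytes. That is far below the ~32 KB limit.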