Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Won't Fix
New
Description
Custom term frequencies allow expert users to index and score in custom ways. However, DefaultIndexingChain limits this, because the sum of the frequencies in a field must not overflow an int:
try {
  invertState.length = Math.addExact(invertState.length, invertState.termFreqAttribute.getTermFrequency());
} catch (ArithmeticException ae) {
  throw new IllegalArgumentException("too many tokens for field \"" + field.name() + "\"");
}
This might become an issue if, for example, the frequency data is encoded in a different way, say because a specific scorer works with float frequencies.
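To make the failure concrete, here is a minimal, self-contained sketch. It assumes the float frequencies are stored as raw int bits via Float.floatToIntBits; the issue does not say which encoding is used, so that part is only an illustration. The class and method names are hypothetical, and the loop only mirrors the Math.addExact accumulation quoted above; it does not call Lucene.

```java
public class FloatFreqOverflowDemo {

    // Mirrors the quoted accumulation: an int sum that throws on overflow.
    public static int sumExact(int[] freqs) {
        int length = 0;
        for (int f : freqs) {
            length = Math.addExact(length, f); // ArithmeticException on int overflow
        }
        return length;
    }

    // Returns true if summing the given frequencies overflows an int.
    public static boolean overflows(int[] freqs) {
        try {
            sumExact(freqs);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Ordinary small frequencies sum without trouble.
        System.out.println(overflows(new int[] {3, 5, 7}));

        // Assumed encoding: 2.0f stored as its raw bits, 0x40000000.
        // Two such tokens already exceed Integer.MAX_VALUE.
        int encoded = Float.floatToIntBits(2.0f);
        System.out.println(overflows(new int[] {encoded, encoded}));
    }
}
```

With such an encoding, a field with only two tokens would already hit the "too many tokens" exception, even though the values are perfectly valid for the custom scorer.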
A sum method could be added to TermFrequencyAttribute to get something like:
invertState.length = invertState.termFreqAttribute.addFrequency(invertState.length);
so that users can define the summing method themselves and avoid the overflow exceptions.
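A rough sketch of how the proposal could look follows. It is not the Lucene API: TermFreqAttr is a hypothetical, simplified stand-in for TermFrequencyAttribute, addFrequency is the method proposed in this issue, and FloatTermFreqAttr with its raw-bits float encoding is an assumption made only for illustration.

```java
public class AddFrequencySketch {

    /** Hypothetical stand-in for TermFrequencyAttribute (not the real interface). */
    public interface TermFreqAttr {
        int getTermFrequency();

        // Proposed hook: the default keeps the existing overflow-checked int sum.
        default int addFrequency(int length) {
            return Math.addExact(length, getTermFrequency());
        }
    }

    /** Hypothetical attribute whose frequency is a float stored as raw int bits. */
    public static final class FloatTermFreqAttr implements TermFreqAttr {
        private final int bits;

        public FloatTermFreqAttr(float freq) {
            this.bits = Float.floatToIntBits(freq);
        }

        @Override
        public int getTermFrequency() {
            return bits;
        }

        @Override
        public int addFrequency(int length) {
            // Decode both operands, add as floats, and re-encode the running sum.
            // A starting length of 0 decodes to 0.0f.
            float sum = Float.intBitsToFloat(length) + Float.intBitsToFloat(bits);
            return Float.floatToIntBits(sum);
        }
    }

    // What the indexing loop would do: delegate the summing to the attribute.
    public static int accumulate(TermFreqAttr[] tokens) {
        int length = 0;
        for (TermFreqAttr t : tokens) {
            length = t.addFrequency(length);
        }
        return length;
    }

    public static void main(String[] args) {
        // Default behaviour: plain int frequencies, checked addition.
        TermFreqAttr[] plain = { () -> 3, () -> 5 };
        System.out.println(accumulate(plain));

        // Custom behaviour: float frequencies summed without an int overflow.
        TermFreqAttr[] floats = { new FloatTermFreqAttr(2.0f), new FloatTermFreqAttr(2.0f) };
        System.out.println(Float.intBitsToFloat(accumulate(floats)));
    }
}
```

The default method keeps today's behaviour for everyone who does not override it, so only attributes with a non-standard encoding would need to supply their own sum.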