TermWeight currently loads norms all the time, even when needsScores is false.
Here is a patch.
Should we pull this out and, for consistency, use it in other places like PhraseWeight/SpanWeight/etc.?
I like the non-scoring SimScorer. We should indeed factor it out as a "utility class" so it can be reused by other queries, too.
Here is a new patch that fixes all queries. I wanted to make it impossible to forget to apply this optimization, so IndexSearcher.getSimilarity now takes a boolean needsScores parameter and returns a dummy similarity when scores are not needed. This forces every query that works with a Similarity to propagate needsScores to the index searcher, which ensures we do not load e.g. norms when scores are not needed.
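To illustrate the idea, here is a minimal, self-contained sketch. It is not the patch itself: NormsSource, the simplified Similarity and SimScorer interfaces, Searcher, and the SCORING/NON_SCORING constants are all hypothetical stand-ins rather than Lucene's real classes. It only shows that when the searcher hands out a dummy similarity, the norms are never touched.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class NonScoringSimilaritySketch {

    /** Hypothetical stand-in for a norms reader; counts how often norms get loaded. */
    static final class NormsSource {
        final AtomicInteger loads = new AtomicInteger();

        long[] load() {
            loads.incrementAndGet();
            return new long[] {1L, 2L, 4L};
        }
    }

    /** Hypothetical, heavily simplified per-query scorer. */
    interface SimScorer {
        float score(int doc, float freq);
    }

    /** Hypothetical, heavily simplified similarity: builds a scorer for one query. */
    interface Similarity {
        SimScorer simScorer(NormsSource norms);
    }

    /** Scoring similarity: needs norms, so it loads them when the scorer is created. */
    static final Similarity SCORING = norms -> {
        long[] values = norms.load();
        return (doc, freq) -> freq / values[doc];
    };

    /** Dummy similarity: never loads norms and always returns a constant score. */
    static final Similarity NON_SCORING = norms -> (doc, freq) -> 0f;

    /** Hypothetical searcher: callers must say whether they need scores. */
    static final class Searcher {
        Similarity getSimilarity(boolean needsScores) {
            return needsScores ? SCORING : NON_SCORING;
        }
    }

    public static void main(String[] args) {
        Searcher searcher = new Searcher();

        NormsSource unscored = new NormsSource();
        searcher.getSimilarity(false).simScorer(unscored).score(1, 3f);
        System.out.println("norm loads without scores: " + unscored.loads.get());

        NormsSource scored = new NormsSource();
        searcher.getSimilarity(true).simScorer(scored).score(1, 3f);
        System.out.println("norm loads with scores: " + scored.loads.get());
    }
}
```

Because the flag is a required argument, a query cannot obtain a similarity without stating whether it needs scores, which is what makes the optimization hard to forget.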
For SpanWeight, you don't need to pass needsScores back down to the constructor, as if the TermContexts map is null then you know scores are not required. So you can just call IndexSearcher.getSimilarity(termContexts != null).
Thanks Alan, that makes sense especially if we want to avoid exploding the number of parameters. I just updated the patch with your suggestion.
Robert, Uwe, would you mind having a look at the latest patch? I think this bug is a good candidate for 5.2.1.
Thanks for fixing in a generic way that works for all queries.
I suppose this constant NON_SCORING_SIMILARITY could go either in IndexSearcher or in Similarity. I think it would feel a bit better in Similarity, as that would mirror how some other classes expose empty/no-op style special instances on the class they are an instance of.
The reason I put it on IndexSearcher is that it can only be used for searching, as it can't compute norms. So I think it makes sense to keep it on IndexSearcher and private, so that it is less likely to get used by accident.
I would better understand your point if it was all but unusable – e.g. threw UnsupportedOperation exceptions left & right but it seems like a constant Similarity that is innocent enough; I don't know how anyone would use this class by accident. But my point isn't important; just a matter of taste.
Commit 1684502 from Adrien Grand in branch 'dev/trunk'
[ https://svn.apache.org/r1684502 ]
LUCENE-6527: Queries now get a dummy Similarity when scores are not needed in order to not load unnecessary information like norms.
Commit 1684506 from Adrien Grand in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1684506 ]
Commit 1684528 from Adrien Grand in branch 'dev/trunk'
[ https://svn.apache.org/r1684528 ]
LUCENE-6527: Fix rare test bug.
Commit 1684530 from Adrien Grand in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1684530 ]
Commit 1684531 from Adrien Grand in branch 'dev/branches/lucene_solr_5_2'
[ https://svn.apache.org/r1684531 ]