Lucene - Core
  1. Lucene - Core
  2. LUCENE-6527

TermWeight should not load norms when needsScores is false

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.2.1, 6.0
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      TermWeight currently loads norms all the time, even when needsScores is false.

      1. LUCENE-6527.patch
        25 kB
        Adrien Grand
      2. LUCENE-6527.patch
        41 kB
        Adrien Grand
      3. LUCENE-6527.patch
        7 kB
        Adrien Grand

        Activity

        Hide
        Adrien Grand added a comment -

        Here is a patch.

        Show
        Adrien Grand added a comment - Here is a patch.
        Hide
        Robert Muir added a comment -

        Should we pull this out and for consistency use in other places like PhraseWeight/SpanWeight/etc?

        Show
        Robert Muir added a comment - Should we pull this out and for consistency use in other places like PhraseWeight/SpanWeight/etc?
        Hide
        Uwe Schindler added a comment -

        I like the non-scoring SimScorer We should indeed factor it out as a "utility class", so it can be reused by other queries, too.

        Show
        Uwe Schindler added a comment - I like the non-scoring SimScorer We should indeed factor it out as a "utility class", so it can be reused by other queries, too.
        Hide
        Adrien Grand added a comment -

        Here is a new patch that fixes all queries. I wanted to make it impossible to forget to apply this optimization so the way the patch works is that IndexSearcher.getSimilarity now takes a boolean needsScores parameter too and returns a dummy similarity when scores are not needed. This forces all queries that need to work with a Similarity to propagate needsScores to index searcher to make sure we do not load eg. norms when scores are not needed.

        Show
        Adrien Grand added a comment - Here is a new patch that fixes all queries. I wanted to make it impossible to forget to apply this optimization so the way the patch works is that IndexSearcher.getSimilarity now takes a boolean needsScores parameter too and returns a dummy similarity when scores are not needed. This forces all queries that need to work with a Similarity to propagate needsScores to index searcher to make sure we do not load eg. norms when scores are not needed.
        Hide
        Alan Woodward added a comment -

        For SpanWeight, you don't need to pass needsScores back down to the constructor, as if the TermContexts map is null then you know scores are not required. So you can just call IndexSearcher.getSimilarity(termContexts != null).

        Show
        Alan Woodward added a comment - For SpanWeight, you don't need to pass needsScores back down to the constructor, as if the TermContexts map is null then you know scores are not required. So you can just call IndexSearcher.getSimilarity(termContexts != null) .
        Hide
        Adrien Grand added a comment -

        Thanks Alan, that makes sense especially if we want to avoid exploding the number of parameters. I just updated the patch with your suggestion.

        Show
        Adrien Grand added a comment - Thanks Alan, that makes sense especially if we want to avoid exploding the number of parameters. I just updated the patch with your suggestion.
        Hide
        Adrien Grand added a comment -

        Robert, Uwe, would you mind having a look at the latest patch? I think this bug is a good candidate to 5.2.1?

        Show
        Adrien Grand added a comment - Robert, Uwe, would you mind having a look at the latest patch? I think this bug is a good candidate to 5.2.1?
        Hide
        Robert Muir added a comment -

        +1

        Thanks for fixing in a generic way that works for all queries.

        Show
        Robert Muir added a comment - +1 Thanks for fixing in a generic way that works for all queries.
        Hide
        David Smiley added a comment -

        Nice.

        I suppost this constant NON_SCORING_SIMILARITY could either go in IndexSearcher or be in Similarity. I think it would feel a bit better in Similarity, as it would mirror how some other classes have empty/no-op style special instances on the class it is an instance of.

        Show
        David Smiley added a comment - Nice. I suppost this constant NON_SCORING_SIMILARITY could either go in IndexSearcher or be in Similarity. I think it would feel a bit better in Similarity, as it would mirror how some other classes have empty/no-op style special instances on the class it is an instance of.
        Hide
        Adrien Grand added a comment -

        The reason I put in on IndexSearcher is that it can only be used for searching as it can't compute norms. So I think it makes sense to keep in on IndexSearcher and private so that it is less likely to get used by accident.

        Show
        Adrien Grand added a comment - The reason I put in on IndexSearcher is that it can only be used for searching as it can't compute norms. So I think it makes sense to keep in on IndexSearcher and private so that it is less likely to get used by accident.
        Hide
        David Smiley added a comment -

        I would better understand your point if it was all but unusable – e.g. threw UnsupportedOperation exceptions left & right but it seems like a constant Similarity that is innocent enough; I don't know how anyone would use this class by accident. But my point isn't important; just a matter of taste.

        Show
        David Smiley added a comment - I would better understand your point if it was all but unusable – e.g. threw UnsupportedOperation exceptions left & right but it seems like a constant Similarity that is innocent enough; I don't know how anyone would use this class by accident. But my point isn't important; just a matter of taste.
        Hide
        ASF subversion and git services added a comment -

        Commit 1684502 from Adrien Grand in branch 'dev/trunk'
        [ https://svn.apache.org/r1684502 ]

        LUCENE-6527: Queries now get a dummy Similarity when scores are not needed in order to not load unnecessary information like norms.

        Show
        ASF subversion and git services added a comment - Commit 1684502 from Adrien Grand in branch 'dev/trunk' [ https://svn.apache.org/r1684502 ] LUCENE-6527 : Queries now get a dummy Similarity when scores are not needed in order to not load unnecessary information like norms.
        Hide
        ASF subversion and git services added a comment -

        Commit 1684506 from Adrien Grand in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1684506 ]

        LUCENE-6527: Queries now get a dummy Similarity when scores are not needed in order to not load unnecessary information like norms.

        Show
        ASF subversion and git services added a comment - Commit 1684506 from Adrien Grand in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1684506 ] LUCENE-6527 : Queries now get a dummy Similarity when scores are not needed in order to not load unnecessary information like norms.
        Hide
        ASF subversion and git services added a comment -

        Commit 1684528 from Adrien Grand in branch 'dev/trunk'
        [ https://svn.apache.org/r1684528 ]

        LUCENE-6527: Fix rare test bug.

        Show
        ASF subversion and git services added a comment - Commit 1684528 from Adrien Grand in branch 'dev/trunk' [ https://svn.apache.org/r1684528 ] LUCENE-6527 : Fix rare test bug.
        Hide
        ASF subversion and git services added a comment -

        Commit 1684530 from Adrien Grand in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1684530 ]

        LUCENE-6527: Fix rare test bug.

        Show
        ASF subversion and git services added a comment - Commit 1684530 from Adrien Grand in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1684530 ] LUCENE-6527 : Fix rare test bug.
        Hide
        ASF subversion and git services added a comment -

        Commit 1684531 from Adrien Grand in branch 'dev/branches/lucene_solr_5_2'
        [ https://svn.apache.org/r1684531 ]

        LUCENE-6527: Queries now get a dummy Similarity when scores are not needed in order to not load unnecessary information like norms.

        Show
        ASF subversion and git services added a comment - Commit 1684531 from Adrien Grand in branch 'dev/branches/lucene_solr_5_2' [ https://svn.apache.org/r1684531 ] LUCENE-6527 : Queries now get a dummy Similarity when scores are not needed in order to not load unnecessary information like norms.

          People

          • Assignee:
            Adrien Grand
            Reporter:
            Adrien Grand
          • Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development