Uploaded image for project: 'Lucene - Core'
  1. Lucene - Core
  2. LUCENE-7395

Query Norm and coordination factor not calculated when PerFieldSimilarityWrapper is used

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.3.1, 5.4.1
    • Fix Version/s: 6.2
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      If any kind of similarity is defined and therefore the SchemaSimilarityFactory is defined as global similarity the queryNorm is always 1.0

      The PerFieldSimilarityWrapper delegates some of the methods to the desired Similarity but misses to delegate public float queryNorm(float valueForNormalization)
      Instead the IndexReader calls this method on the base class Similarity.

      The result is that all scores are much higher.

      I created a custom similarity which extends ClassicSimilarity.
      To have the calculation fixed I did a local "hotfix" which always uses the default similarity. Also wrong for some cases but fine in my scenario.

      @Override
      public float queryNorm(float valueForNormalization)

      { return get("").queryNorm(valueForNormalization); // use default similarity to calculate }
      1. LUCENE-7395.patch
        15 kB
        Uwe Schindler

        Issue Links

          Activity

          Hide
          saschamarkus Sascha Markus added a comment -

          Of couse the same applies to methods like public float coord(int overlap, int maxOverlap)

          Show
          saschamarkus Sascha Markus added a comment - Of couse the same applies to methods like public float coord(int overlap, int maxOverlap)
          Hide
          thetaphi Uwe Schindler added a comment -

          This affects also Lucene, so I moved it.

          Show
          thetaphi Uwe Schindler added a comment - This affects also Lucene, so I moved it.
          Hide
          dsmiley David Smiley added a comment -

          Is this a duplicate of SOLR-9315?

          Show
          dsmiley David Smiley added a comment - Is this a duplicate of SOLR-9315 ?
          Hide
          thetaphi Uwe Schindler added a comment -

          I think the Solr one is a duplicate of this one. The problem already exists in Lucene, because it returns a static constant of 1 for those methods.

          I have the feeling that PerFieldSimilarityWrapper should provide some "fallback" similraity (defaultSimilarity in Solr), too. Same as PerFieldAnalyzerWrapper. It is currently abstract and you have to implement the "per field" method. But for all those which do not use fields, it should also provide some abstract method to return a default for all methods in similarity that are not per-field.

          Show
          thetaphi Uwe Schindler added a comment - I think the Solr one is a duplicate of this one. The problem already exists in Lucene, because it returns a static constant of 1 for those methods. I have the feeling that PerFieldSimilarityWrapper should provide some "fallback" similraity (defaultSimilarity in Solr), too. Same as PerFieldAnalyzerWrapper. It is currently abstract and you have to implement the "per field" method. But for all those which do not use fields, it should also provide some abstract method to return a default for all methods in similarity that are not per-field.
          Hide
          thetaphi Uwe Schindler added a comment -

          My proposal is to add a second abstarct method like the protected get(): https://lucene.apache.org/core/6_1_0/core/org/apache/lucene/search/similarities/PerFieldSimilarityWrapper.html#get-java.lang.String-

          In that case it could be named abstract Similarity default().

          Show
          thetaphi Uwe Schindler added a comment - My proposal is to add a second abstarct method like the protected get() : https://lucene.apache.org/core/6_1_0/core/org/apache/lucene/search/similarities/PerFieldSimilarityWrapper.html#get-java.lang.String- In that case it could be named abstract Similarity default() .
          Hide
          jpountz Adrien Grand added a comment -

          I agree the default impl of coord and queryNorm on PerFieldSimilarity is trappy in 6.x. +1 to adding an abstract method or a required constructor argument.

          Show
          jpountz Adrien Grand added a comment - I agree the default impl of coord and queryNorm on PerFieldSimilarity is trappy in 6.x. +1 to adding an abstract method or a required constructor argument.
          Hide
          thetaphi Uwe Schindler added a comment -

          I think a required ctor param is also fine. The abstract method is not needed as it should always be a constant. Like the other methods, I will make the impls final. Thanks Adrien!

          As this changes API we can do this in 6.2, not earlier.

          Show
          thetaphi Uwe Schindler added a comment - I think a required ctor param is also fine. The abstract method is not needed as it should always be a constant. Like the other methods, I will make the impls final. Thanks Adrien! As this changes API we can do this in 6.2, not earlier.
          Hide
          thetaphi Uwe Schindler added a comment - - edited

          Patch using a default similarity passed through ctor. This patch also fixes the Solr issue.

          Show
          thetaphi Uwe Schindler added a comment - - edited Patch using a default similarity passed through ctor. This patch also fixes the Solr issue.
          Hide
          jpountz Adrien Grand added a comment -

          +1

          Show
          jpountz Adrien Grand added a comment - +1
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 22d24969f5b148a78684482592c63e6f976fae6c in lucene-solr's branch refs/heads/branch_6x from Uwe Schindler
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=22d2496 ]

          LUCENE-7395, SOLR-9315: Fix PerFieldSimilarityWrapper to also delegate query norm and coordination factor using a default similarity added as ctor param

          Show
          jira-bot ASF subversion and git services added a comment - Commit 22d24969f5b148a78684482592c63e6f976fae6c in lucene-solr's branch refs/heads/branch_6x from Uwe Schindler [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=22d2496 ] LUCENE-7395 , SOLR-9315 : Fix PerFieldSimilarityWrapper to also delegate query norm and coordination factor using a default similarity added as ctor param
          Hide
          mikemccand Michael McCandless added a comment -

          Bulk close resolved issues after 6.2.0 release.

          Show
          mikemccand Michael McCandless added a comment - Bulk close resolved issues after 6.2.0 release.

            People

            • Assignee:
              thetaphi Uwe Schindler
              Reporter:
              saschamarkus Sascha Markus
            • Votes:
              1 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development