Solr
  1. Solr
  2. SOLR-5561

DefaultSimilarity 'init' method is not called, if the similarity plugin is not explicitly declared in the schema

    Details

      Description

      As a result, discountOverlap is not initialized to true, and the default behavior is that multiple terms on the same position DO affect the fieldNorm. this is not the intended default.

      1. SOLR-5561.patch
        5 kB
        Hoss Man
      2. SOLR-5561.patch
        6 kB
        Vitaliy Zhovtyuk
      3. SOLR-5561.patch
        3 kB
        Ahmet Arslan
      4. SOLR-5561.patch
        0.8 kB
        Isaac Hebsh

        Activity

        Hide
        Isaac Hebsh added a comment -

        Patch attached.

        Show
        Isaac Hebsh added a comment - Patch attached.
        Hide
        Hoss Man added a comment -

        As a result, discountOverlap is not initialized to true, and the default behavior is that multiple terms on the same position DO affect the fieldNorm. this is not the intended default.

        Grr... that really sucks – and seems easily confusing.

        My concern about making this change as written in Isaac's patch is that it could cause some very weird inconsistencies for people who upgrade.

        My suggestion would be to do the following...

        1) in DefaultSimilarityFactory, change the declaration of the discountOverlaps variable to default to true for cnsistency (seems like a good idea in general, even if we do fix the lifecycle to ensure init() gets called)
        2) in IndexSchema.java, where Isaac's patch currently fixes the hadcoded construction of a default instance of DefaultSimilarityFactory to always call init with empty SolrParams, make the params pased to the init param contingent on the value of the luceneMatchVersion – if it's less then 4.7, pass discountOverlaps=false to the init method for backcompat, and if it's 4.7 or greater pass an empty set of params (so the factories defaults are used correctly)
        3) write some tests using trivial solrconfig.xml+schema.xml files with hardcoded version match params to verify that the correct behavior is being implemented.

        thoughts?

        Show
        Hoss Man added a comment - As a result, discountOverlap is not initialized to true, and the default behavior is that multiple terms on the same position DO affect the fieldNorm. this is not the intended default. Grr... that really sucks – and seems easily confusing. My concern about making this change as written in Isaac's patch is that it could cause some very weird inconsistencies for people who upgrade. My suggestion would be to do the following... 1) in DefaultSimilarityFactory, change the declaration of the discountOverlaps variable to default to true for cnsistency (seems like a good idea in general, even if we do fix the lifecycle to ensure init() gets called) 2) in IndexSchema.java, where Isaac's patch currently fixes the hadcoded construction of a default instance of DefaultSimilarityFactory to always call init with empty SolrParams, make the params pased to the init param contingent on the value of the luceneMatchVersion – if it's less then 4.7, pass discountOverlaps=false to the init method for backcompat, and if it's 4.7 or greater pass an empty set of params (so the factories defaults are used correctly) 3) write some tests using trivial solrconfig.xml+schema.xml files with hardcoded version match params to verify that the correct behavior is being implemented. thoughts?
        Hide
        Ahmet Arslan added a comment -

        I had this test case handy.

        Show
        Ahmet Arslan added a comment - I had this test case handy.
        Hide
        Vitaliy Zhovtyuk added a comment -

        Added changes to IndexSchema
        Added IndexSchema tests checking discountOverlaps depending on luceneMatchVersion

        Show
        Vitaliy Zhovtyuk added a comment - Added changes to IndexSchema Added IndexSchema tests checking discountOverlaps depending on luceneMatchVersion
        Hide
        Hoss Man added a comment -

        Ahmet & Vitaliy - thanks for your tests & improvements!

        Here's a slightly updated patch...

        • introduce a String constant for the "discountOverlaps" param name (since it's now used outside the class in IndexSchema, a constant is appropriate)
        • consolidated the two tests classes into one to reduce some duplication & leverage some existing BaseSimilarityTestCase helper methods
        • change the tests to leverage the Version enum values instead of just version string constants (this will make it easier to find/remove tests refering to 4.6 and 4.7 once those constants are no longer explicitly supported)
        • added some javadocs

        I'll commit soon unless anyone sees any problems?

        Show
        Hoss Man added a comment - Ahmet & Vitaliy - thanks for your tests & improvements! Here's a slightly updated patch... introduce a String constant for the "discountOverlaps" param name (since it's now used outside the class in IndexSchema, a constant is appropriate) consolidated the two tests classes into one to reduce some duplication & leverage some existing BaseSimilarityTestCase helper methods change the tests to leverage the Version enum values instead of just version string constants (this will make it easier to find/remove tests refering to 4.6 and 4.7 once those constants are no longer explicitly supported) added some javadocs I'll commit soon unless anyone sees any problems?
        Hide
        ASF subversion and git services added a comment -

        Commit 1566842 from hossman@apache.org in branch 'dev/trunk'
        [ https://svn.apache.org/r1566842 ]

        SOLR-5561: Fix implicit DefaultSimilarityFactory initialization in IndexSchema to properly specify discountOverlap option

        Show
        ASF subversion and git services added a comment - Commit 1566842 from hossman@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1566842 ] SOLR-5561 : Fix implicit DefaultSimilarityFactory initialization in IndexSchema to properly specify discountOverlap option
        Hide
        ASF subversion and git services added a comment -

        Commit 1566871 from hossman@apache.org in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1566871 ]

        SOLR-5561: Fix implicit DefaultSimilarityFactory initialization in IndexSchema to properly specify discountOverlap option (merge r1566842)

        Show
        ASF subversion and git services added a comment - Commit 1566871 from hossman@apache.org in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1566871 ] SOLR-5561 : Fix implicit DefaultSimilarityFactory initialization in IndexSchema to properly specify discountOverlap option (merge r1566842)
        Hide
        Hoss Man added a comment -

        Thanks everyone!

        Show
        Hoss Man added a comment - Thanks everyone!

          People

          • Assignee:
            Hoss Man
            Reporter:
            Isaac Hebsh
          • Votes:
            1 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development