  1. Solr
  2. SOLR-8057 Change default Sim to BM25 (w/backcompat config handling)
  3. SOLR-8270

Make BM25SimilarityFactory the implicit default when no sim is configured for luceneMatchVersion >= 6.0

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 6.0
    • Component/s: None
    • Labels:
      None

      Description

      As discussed in the parent issue, when the luceneMatchVersion is >= 6.0, IndexSearcher should use BM25SimilarityFactory as the implicit default if no explicit default is configured.
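
The selection rule described above can be sketched in a few lines of Java. This is an illustration only, not the code from the patch: the class and method names are hypothetical, the version is reduced to its major number, and the pre-6.0 fallback is assumed to be ClassicSimilarityFactory, inferred from the comment below that an older luceneMatchVersion yields ClassicSimilarity.

```java
public class ImplicitSimilarityDefault {

    /**
     * Hypothetical helper: picks the factory name to use when the schema
     * declares no similarity at all. An explicitly configured similarity
     * would bypass this method entirely.
     */
    static String implicitFactoryName(int luceneMatchMajorVersion) {
        // Assumption: 6 and later get BM25, anything older keeps the classic TF-IDF default.
        return luceneMatchMajorVersion >= 6
                ? "BM25SimilarityFactory"
                : "ClassicSimilarityFactory";
    }

    public static void main(String[] args) {
        // Print the implicit choice for an old and a new match version.
        System.out.println("5 -> " + implicitFactoryName(5));
        System.out.println("6 -> " + implicitFactoryName(6));
    }
}
```

Keying the default on luceneMatchVersion means existing configs that declare an older version keep their previous scoring unless they opt in.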

          Activity

          hossman Hoss Man added a comment -

          Patch so far, still vetting/testing...

          • IndexSchema
            • checks luceneMatchVersion to decide what implicit SimFactory to use
          • TestNonDefinedSimilarityFactory
            • existing test now asserts schema-tiny.xml will get BM25Similarity
            • new test method that sets luceneMatchVersion before initing core and asserts ClassicSimilarity for same schema
          • Misc tests updated to account for new BM25 default behavior...
            • TestGroupingSearch
            • QueryElevationComponentTest
            • StatsComponentTest
            • TestSchemaSimilarityResource
            • ChangedSchemaMergeTest
            • TestExtendedDismaxParser
            • TestReRankQParserPlugin
            • TestSolrQueryParser
            • SchemaTest
          • TestFunctionQuery
            • refactored so that tf/idf function could be tested against a field that explicitly used ClassicSim (these valuesources require TFIDFSimilarity)
              • tweaked schema11.xml to have a new fieldType with the needed sim for this new refactored test
            • one other small tweak needed to testGeneral to account for new BM25 lengthNorm behavior
          jira-bot ASF subversion and git services added a comment -

          Commit 1713902 from hossman@apache.org in branch 'dev/trunk'
          [ https://svn.apache.org/r1713902 ]

          SOLR-8270: Change implicit default Similarity to use BM25 when luceneMatchVersion >= 6


            People

            • Assignee:
              hossman Hoss Man
              Reporter:
              hossman Hoss Man