Lucene - Core / LUCENE-789

Custom similarity is ignored when using MultiSearcher

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.1
    • Fix Version/s: 2.2
    • Component/s: core/search
    • Labels:
      None
    • Lucene Fields:
      Patch Available

      Description

      Symptoms:
      I am using Searcher.setSimilarity() to provide a custom similarity that turns off the tf() factor. However, somewhere along the way the custom similarity is ignored and the DefaultSimilarity is used instead. I am using MultiSearcher and BooleanQuery.
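
To make "turning off the tf() factor" concrete, here is a minimal, self-contained sketch. The classes below are simplified stand-ins written for this illustration, not Lucene's actual Similarity classes. The only behaviour taken from Lucene is that the default tf() is the square root of the term frequency.

```java
// Simplified stand-ins for illustration only; NOT Lucene's real classes.
abstract class TfSimilarity {
    // Score factor derived from how often a term occurs in a document.
    abstract float tf(float freq);
}

// Mirrors the default behaviour: tf grows with the square root of the frequency.
class SqrtTfSimilarity extends TfSimilarity {
    @Override
    float tf(float freq) {
        return (float) Math.sqrt(freq);
    }
}

// Custom variant that "turns off" tf: any match contributes the same factor.
class NoTfSimilarity extends SqrtTfSimilarity {
    @Override
    float tf(float freq) {
        return freq > 0 ? 1.0f : 0.0f;
    }
}

class TfDemo {
    public static void main(String[] args) {
        float[] freqs = {1f, 4f, 9f};
        TfSimilarity withTf = new SqrtTfSimilarity();
        TfSimilarity withoutTf = new NoTfSimilarity();
        for (float f : freqs) {
            System.out.println("freq=" + f
                    + " default tf=" + withTf.tf(f)
                    + " custom tf=" + withoutTf.tf(f));
        }
    }
}
```

If the custom similarity is silently replaced by the default one, documents that repeat a term more often score higher again, which is the symptom reported here.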

      Problem analysis:
      The problem seems to be in the MultiSearcher.createWeight(Query) method. It creates an instance of CachedDfSource but does not set the similarity on it. As a result, CachedDfSource provides DefaultSimilarity to the queries that use it.

      Potential solution:
      Adding the following line:
      cacheSim.setSimilarity(getSimilarity());
      after creating an instance of CachedDfSource (line 312) seems to fix the problem. However, I don't understand enough of the inner workings of this class to be absolutely sure that this is the right thing to do.
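
The analysis and the proposed one-line fix can be sketched with a small, self-contained model. Every class here (SimilarityModel, SearcherModel, CachedSourceModel, MultiSearcherModel) is a hypothetical stand-in invented for this illustration and only mimics the roles described above. The applyFix flag is likewise an assumption, used to show both behaviours side by side.

```java
// Simplified model for illustration only; these are NOT Lucene's real classes.
interface SimilarityModel {
    float tf(float freq);
}

class SearcherModel {
    // Stand-in for the default similarity: tf = sqrt(freq).
    static final SimilarityModel DEFAULT = freq -> (float) Math.sqrt(freq);

    private SimilarityModel similarity = DEFAULT;

    void setSimilarity(SimilarityModel similarity) {
        this.similarity = similarity;
    }

    SimilarityModel getSimilarity() {
        return similarity;
    }
}

// Plays the role of CachedDfSource: a searcher created internally per query.
class CachedSourceModel extends SearcherModel {
}

// Plays the role of MultiSearcher.
class MultiSearcherModel extends SearcherModel {
    CachedSourceModel createSource(boolean applyFix) {
        CachedSourceModel cacheSim = new CachedSourceModel();
        if (applyFix) {
            // The proposed fix: hand the outer searcher's similarity to the inner one.
            cacheSim.setSimilarity(getSimilarity());
        }
        // Without the fix, cacheSim keeps the default similarity.
        return cacheSim;
    }
}

class SimilarityPropagationDemo {
    public static void main(String[] args) {
        SimilarityModel noTf = freq -> freq > 0 ? 1.0f : 0.0f;
        MultiSearcherModel multi = new MultiSearcherModel();
        multi.setSimilarity(noTf);

        System.out.println("custom similarity used without fix: "
                + (multi.createSource(false).getSimilarity() == noTf));
        System.out.println("custom similarity used with fix: "
                + (multi.createSource(true).getSimilarity() == noTf));
    }
}
```

In this model the inner searcher is the object that queries see when their weights are built, so a similarity set only on the outer searcher never reaches them unless it is copied across.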

      1. 789_patch.txt
        3 kB
        Doron Cohen
      2. TestMultiSearcherSimilarity.java
        3 kB
        Alexey Lef

        Activity

        otis Otis Gospodnetic added a comment -

        Alexey, the best way to start with this, and the way that will help get this fixed in Lucene core is to write a unit test class that does what your code does with MultiSearcher and BooleanQuery, and shows that the test fails when a custom Similarity class is used. You can make that custom Similarity an inner class in your unit test class, to contain everything neatly in a single class.

        Once we see the test failing, we can apply your suggested fix and check that it works: that your previously failing unit test now passes and that all other unit tests still pass.

        alexeylef Alexey Lef added a comment -

        Attached unit test

        doronc Doron Cohen added a comment -

        Thanks for the test case, Alexey!

        Problem was in MultiSearcher.CachedDfSource.
        Attached patch fixes this in MultiSearcher, plus adds the test-case to existing MultiSearcherTest.

        doronc Doron Cohen added a comment -

        Fix committed, thanks Alexey!

        Note that, as was already the case before this fix, creating a MultiSearcher from Searchers on which a custom similarity was set has no effect: the custom similarities of those searchers are masked by the similarity of the MultiSearcher. This is as designed, because MultiSearcher operates on Searchables (not on Searchers).
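
The masking described above can be illustrated with a small model. It rests on an assumption made for this sketch: the weight captures a similarity once, when the outer searcher creates it, and the sub-searchers only execute that weight. All names below (TfFunction, WeightSketch, SearchableSketch, SubSearcherSketch, MultiSearcherSketch) are hypothetical and are not Lucene classes.

```java
// Simplified model for illustration only; these are NOT Lucene's real classes.
interface TfFunction {
    float tf(float freq);
}

// The weight captures a similarity at creation time.
class WeightSketch {
    final TfFunction similarity;

    WeightSketch(TfFunction similarity) {
        this.similarity = similarity;
    }
}

// Stand-in for Searchable: it can execute a weight but exposes no similarity.
interface SearchableSketch {
    float score(WeightSketch weight, float freq);
}

class SubSearcherSketch implements SearchableSketch {
    // Own similarity; only consulted when this searcher builds the weight itself.
    TfFunction similarity = f -> (float) Math.sqrt(f);

    float search(float freq) {
        return score(new WeightSketch(similarity), freq);
    }

    @Override
    public float score(WeightSketch weight, float freq) {
        return weight.similarity.tf(freq);
    }
}

class MultiSearcherSketch {
    TfFunction similarity = f -> (float) Math.sqrt(f);
    private final SearchableSketch[] searchables;

    MultiSearcherSketch(SearchableSketch... searchables) {
        this.searchables = searchables;
    }

    float search(float freq) {
        // The weight is built once, from the multi-searcher's own similarity.
        WeightSketch weight = new WeightSketch(similarity);
        float best = 0f;
        for (SearchableSketch s : searchables) {
            best = Math.max(best, s.score(weight, freq));
        }
        return best;
    }
}
```

Under this model, a custom similarity has to be set on the multi-searcher itself to take effect; setting it on the sub-searchers changes only searches run directly against them.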


          People

          • Assignee:
            doronc Doron Cohen
            Reporter:
            alexeylef Alexey Lef
          • Votes:
            0
            Watchers:
            1
