  1. Lucene - Core
  2. LUCENE-1087

MultiSearcher.explain returns incorrect score/explanation relating to docFreq

Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.2
    • Fix Version/s: 2.9
    • Component/s: core/query/scoring
    • Labels: None
    • Environment: No special hardware required to reproduce the issue.
    • Lucene Fields: New

    Description

      To reproduce: create 2 different indexes, search each one individually and print the score details, then search both indexes with MultiSearcher, print the score details, and compare.

      The "docFreq" value printed isn't correct - the values it prints are as if each index was searched individually.

      Code is like:

      MultiSearcher multi = new MultiSearcher(searchables);
      Hits hits = multi.search(query);
      for(int i=0; i<hits.length(); i++)
      {
        Explanation expl = multi.explain(query, hits.id(i));
        System.out.println(expl.toString());
      }
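
To illustrate why the source of docFreq matters, here is a minimal sketch in plain Java with no Lucene dependency. It assumes the idf formula used by Lucene's DefaultSimilarity, 1 + ln(numDocs / (docFreq + 1)). The class name `IdfSketch` and the per-index statistics are hypothetical and are not taken from the reporter's indexes.

```java
public class IdfSketch {
    // DefaultSimilarity-style idf (assumed formula): 1 + ln(numDocs / (docFreq + 1)).
    static float idf(int docFreq, int numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    public static void main(String[] args) {
        // Hypothetical statistics for one term in two indexes: {docFreq, numDocs}.
        int[][] indexes = { {1, 6}, {3, 6} };
        int totalDocFreq = 0;
        int totalDocs = 0;
        for (int[] stats : indexes) {
            // idf as each index would compute it when searched on its own
            System.out.println("local idf  = " + idf(stats[0], stats[1])
                    + " (docFreq=" + stats[0] + ")");
            totalDocFreq += stats[0];
            totalDocs += stats[1];
        }
        // idf from statistics summed across both indexes
        System.out.println("global idf = " + idf(totalDocFreq, totalDocs)
                + " (docFreq=" + totalDocFreq + ")");
    }
}
```

With these made-up numbers, neither local idf equals the global one. An explanation built from per-index statistics would therefore describe a different score from one built from the combined statistics.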
      

      I raised this on the Lucene user mailing list and was advised to log a bug; the email thread is given below.

       
      -----Original Message-----
      From: Chris Hostetter  
      Sent: Friday, December 07, 2007 10:30 PM
      To: java-user
      Subject: Re: does the MultiSearcher class calculate IDF properly?
      
      
      a quick glance at the code seems to indicate that MultiSearcher has code 
      for calculating the docFreq across all of the Searchables when searching 
      (or when the docFreq method is explicitly called) but that the explain method 
      just delegates to the Searchable that the specific docid came from.
      
      if you compare that Explanation score you got with the score returned by 
      a HitCollector (or TopDocs) they probably won't match.
      
      So i would say "yes MultiSearcher calculates IDF properly, but 
      MultiSearcher.explain is broken."  Please file a bug about this; i can't 
      think of an easy way to fix it, but it certainly seems broken to me.
      
      
      : Subject: does the MultiSearcher class calculate IDF properly?
      : 
      : I tried the following.  Creating 2 different indexes, search each
      : individually and print score details and compare to searching both
      : indexes with MultiSearcher and printing score details.  
      : 
      : The "docFreq" value printed doesn't seem right - is this just a problem
      : with using Explain together with the MultiSearcher?
      : 
      : 
      : Code is like:
      : MultiSearcher multi = new MultiSearcher(searchables);
      : Hits hits = multi.search(query);
      : for(int i=0; i<hits.length(); i++)
      : {
      :   Explanation expl = multi.explain(query, hits.id(i));
      :   System.out.println(expl.toString());
      : }
      : 
      : 
      : Output:
      : id = 14 score = 0.071
      : 0.07073946 = (MATCH) fieldWeight(contents:climate in 2), product of:
      :   1.0 = tf(termFreq(contents:climate)=1)
      :   1.8109303 = idf(docFreq=1)
      :   0.0390625 = fieldNorm(field=contents, doc=2)
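
The figures in the output above can be checked with a few lines of arithmetic. The sketch below is plain Java, and the class name `ExplainCheck` is hypothetical. It assumes the idf came from the DefaultSimilarity formula, 1 + ln(numDocs / (docFreq + 1)), which the thread does not state explicitly. It multiplies the three reported factors and inverts the idf formula to recover the ratio numDocs / (docFreq + 1) implied by the printed idf.

```java
public class ExplainCheck {
    // Inverts idf = 1 + ln(ratio), giving ratio = numDocs / (docFreq + 1).
    // The formula is an assumption (DefaultSimilarity-style idf).
    static double impliedRatio(double idf) {
        return Math.exp(idf - 1.0);
    }

    public static void main(String[] args) {
        // Factors copied from the explanation output in the report.
        float tf = 1.0f;
        float idf = 1.8109303f;
        float fieldNorm = 0.0390625f;

        // fieldWeight is described in the output as the product of the three factors.
        System.out.println("fieldWeight = " + (tf * idf * fieldNorm));

        // Ratio numDocs / (docFreq + 1) that would yield the printed idf.
        System.out.println("implied numDocs/(docFreq+1) = " + impliedRatio(idf));
    }
}
```

Under that assumption the implied ratio is about 2.25. With docFreq=1 this would require numDocs = 4.5, which is not a whole number. That suggests the idf value was computed from a different docFreq than the one printed beside it, which would fit the mismatch described in the thread.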
      

            People

              Assignee: markrmiller@gmail.com Mark Miller
              Reporter: yasoja Yasoja Seneviratne