Uploaded image for project: 'Lucene - Core'
  1. Lucene - Core
  2. LUCENE-605

Make Explanation include information about match/non-match

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: core/search
    • Labels:
      None

      Description

      As discussed, I'm looking into the possibility of improving the Explanation class to include some basic info about the "match" status of the Explanation – independent of the value...

      http://www.nabble.com/BooleanWeight.normalize%28float%29-doesn%27t-normalize-prohibited-clauses--t1596471.html#a4347644

      This is neccesary to deal with things like LUCENE-451

      1. LUCENE-605-fix-with-subclassing.patch
        16 kB
        Hoss Man
      2. demo-subclassing-explanation-approach.patch
        9 kB
        Hoss Man
      3. demo-fix.patch
        9 kB
        Hoss Man

        Issue Links

          Activity

          Hide
          hossman Hoss Man added a comment -

          Changes have been commited.

          I modified the code slightly from the most recent patch, to include a new producted "getSummary()" method in Explanation which is used toString and toHtml, and overridden by ComplexExplanation ... making it unneccessary to cut/paste the toString/toHtml code from Explanation.

          Show
          hossman Hoss Man added a comment - Changes have been commited. I modified the code slightly from the most recent patch, to include a new producted "getSummary()" method in Explanation which is used toString and toHtml, and overridden by ComplexExplanation ... making it unneccessary to cut/paste the toString/toHtml code from Explanation.
          Hide
          hossman Hoss Man added a comment -

          I should have just kept on working on this last night .. it only took about 20 minutes to make the remaining changes neccessary to all of hte other query types.

          (subclassing is *DEFINITELY* the right solution to this problem)

          This latest patch fixes all of the tests LUCENE-451's bq.containing.clause.with.zero.boost.tests.patch

          If no one objects, I'll commit this sometime in the next 24-48 hours.

          Show
          hossman Hoss Man added a comment - I should have just kept on working on this last night .. it only took about 20 minutes to make the remaining changes neccessary to all of hte other query types. (subclassing is * DEFINITELY * the right solution to this problem) This latest patch fixes all of the tests LUCENE-451 's bq.containing.clause.with.zero.boost.tests.patch If no one objects, I'll commit this sometime in the next 24-48 hours.
          Hide
          hossman Hoss Man added a comment -

          patch showing the first steps towards the subclassing approach.

          I definitely think this is a much better way to go.

          Which this new subclass, and some very simple changes to BooleanQuery/TermQuery, several of the bugs in the LUCENE-451 test patch are fixed ... the rest should be a simple matter of making the corrisponding small changes to the various other query types.

          Show
          hossman Hoss Man added a comment - patch showing the first steps towards the subclassing approach. I definitely think this is a much better way to go. Which this new subclass, and some very simple changes to BooleanQuery/TermQuery, several of the bugs in the LUCENE-451 test patch are fixed ... the rest should be a simple matter of making the corrisponding small changes to the various other query types.
          Hide
          hossman Hoss Man added a comment -

          Hmmm... a subclass relationship might make a lot of sense here ... if we add an isMatch() method to the existing "Explanation" which infers it's state from the value, a subclass could allow the matchiness to be explicitly set in cases where it makes sense. This would have the added benefit of reducing the amount of changes needed in all of hte vaiours uses of Explanation rihgt now.

          Making a new MatchExplanation that superclasses Explanation seems trickier .. MatchExplanation would need a setMatch method to be usefull, but we dont' want that method in Explanation (so we'd need to override it to through an unsupported exception orosmething). Personally I don't think having MatchExplanation subclass Explanation would be all that confusion if a new Matcher auperinterface was added above Scorer – there's no direct correllary between a Scorer and an Explanation anyway (that's what started this whole discussion right, that Explanation is frequencly used to model things much smaller then a basic Doc/Score.

          > Sorry for the rant, I should take a bit more time to consider things before typing comments...

          ...wow, I'm more jaded then I thought ... It didn't even occur to me that you were ranting.

          Show
          hossman Hoss Man added a comment - Hmmm... a subclass relationship might make a lot of sense here ... if we add an isMatch() method to the existing "Explanation" which infers it's state from the value, a subclass could allow the matchiness to be explicitly set in cases where it makes sense. This would have the added benefit of reducing the amount of changes needed in all of hte vaiours uses of Explanation rihgt now. Making a new MatchExplanation that superclasses Explanation seems trickier .. MatchExplanation would need a setMatch method to be usefull, but we dont' want that method in Explanation (so we'd need to override it to through an unsupported exception orosmething). Personally I don't think having MatchExplanation subclass Explanation would be all that confusion if a new Matcher auperinterface was added above Scorer – there's no direct correllary between a Scorer and an Explanation anyway (that's what started this whole discussion right, that Explanation is frequencly used to model things much smaller then a basic Doc/Score. > Sorry for the rant, I should take a bit more time to consider things before typing comments... ...wow, I'm more jaded then I thought ... It didn't even occur to me that you were ranting.
          Hide
          paul.elschot@xs4all.nl Paul Elschot added a comment -

          The purpose of Explanation is to explain all the "mysteries" of query search, so it would be worthwhile to use an extra class for the micro explanations. Since these currently are just Explanations without a match, why not make the micro explanations a superclass of Explanation?
          It might even be simpler to make MatchExplanation a subclass of the existing Explanation, and hope that that does not break backwards compatbility later (I don't know.)
          Such a class structure would be counterintuitive to Matcher as superclass/superinterface of Scorer, so maybe it should be the other way round: make MatchExplanation a superclass of Explanation.

          Sorry for the rant, I should take a bit more time to consider things before typing comments...

          Show
          paul.elschot@xs4all.nl Paul Elschot added a comment - The purpose of Explanation is to explain all the "mysteries" of query search, so it would be worthwhile to use an extra class for the micro explanations. Since these currently are just Explanations without a match, why not make the micro explanations a superclass of Explanation? It might even be simpler to make MatchExplanation a subclass of the existing Explanation, and hope that that does not break backwards compatbility later (I don't know.) Such a class structure would be counterintuitive to Matcher as superclass/superinterface of Scorer, so maybe it should be the other way round: make MatchExplanation a superclass of Explanation. Sorry for the rant, I should take a bit more time to consider things before typing comments...
          Hide
          hossman Hoss Man added a comment -

          > The point is that adding by adding a match indicator to Explanation, Explanation becomes less useful
          > to explain a subformula of a (matching) score value, in this case the coordination factor.

          yeah ... when I added "getMatchString()" to the toString() value It hit me that eliminating that other constructor (or more specificly: eliminating Explanations without a "match" state) is really impractical. Many of the "micro explanations" of things like coord value and tf, and idf don't really make sense with a "match" boolean, and even the ones that do don't really need it. But if anything that just reaffirmed my faith in the isMatch() function using the Boolean if it's set, and infering it from the float value if it's not – but eliminating the 2 arg constructor is certainly impractical – so is automatically including the getMatchString() in the toString(int) .. but I am thinking that if/when I change all the Query.explain methods to use the new match Boolean, I'll also update the description to say wether it was a match .. either that or add it to the toString() method (ie: just at the top level without the recursion) ... but that's a secondary isue to getting the logic right.

          Show
          hossman Hoss Man added a comment - > The point is that adding by adding a match indicator to Explanation, Explanation becomes less useful > to explain a subformula of a (matching) score value, in this case the coordination factor. yeah ... when I added "getMatchString()" to the toString() value It hit me that eliminating that other constructor (or more specificly: eliminating Explanations without a "match" state) is really impractical. Many of the "micro explanations" of things like coord value and tf, and idf don't really make sense with a "match" boolean, and even the ones that do don't really need it. But if anything that just reaffirmed my faith in the isMatch() function using the Boolean if it's set, and infering it from the float value if it's not – but eliminating the 2 arg constructor is certainly impractical – so is automatically including the getMatchString() in the toString(int) .. but I am thinking that if/when I change all the Query.explain methods to use the new match Boolean, I'll also update the description to say wether it was a match .. either that or add it to the toString() method (ie: just at the top level without the recursion) ... but that's a secondary isue to getting the logic right.
          Hide
          paul.elschot@xs4all.nl Paul Elschot added a comment -

          I tried removing the Explanation constructor that is deprecated in the demo-fix.patch.
          One of the uses of this constructor is in the (patched) BooleanQuery from line 317,
          and fixed it like this (under ASL 2):

          sumExpl.setMatch(Boolean.TRUE);
          sumExpl.setValue(sum);

          float coordFactor = similarity.coord(coord, maxCoord);
          if (coordFactor != 1.0f)

          { // coordination has effect sumExpl.setValue(sumExpl.getValue() * coordFactor); sumExpl.setDescription(sumExpl.getDescription() + " * " + coordFactor + "=coord("+coord+"/"+maxCoord+")"); }

          return sumExpl;

          The point is that adding by adding a match indicator to Explanation, Explanation becomes less useful
          to explain a subformula of a (matching) score value, in this case the coordination factor.
          The fix is to add the subformula to the description and the value of the explanation.

          Btw. the actual explained score value was not changed by setValue() in the existing code for the coordination factor.
          This is probably a bug in BooleanQuery.explain().

          There seems to be no test for the explanation descriptions, and I did not have a look at the actually produced
          getDescription() of the returned Explanation in this case.

          Show
          paul.elschot@xs4all.nl Paul Elschot added a comment - I tried removing the Explanation constructor that is deprecated in the demo-fix.patch. One of the uses of this constructor is in the (patched) BooleanQuery from line 317, and fixed it like this (under ASL 2): sumExpl.setMatch(Boolean.TRUE); sumExpl.setValue(sum); float coordFactor = similarity.coord(coord, maxCoord); if (coordFactor != 1.0f) { // coordination has effect sumExpl.setValue(sumExpl.getValue() * coordFactor); sumExpl.setDescription(sumExpl.getDescription() + " * " + coordFactor + "=coord("+coord+"/"+maxCoord+")"); } return sumExpl; The point is that adding by adding a match indicator to Explanation, Explanation becomes less useful to explain a subformula of a (matching) score value, in this case the coordination factor. The fix is to add the subformula to the description and the value of the explanation. Btw. the actual explained score value was not changed by setValue() in the existing code for the coordination factor. This is probably a bug in BooleanQuery.explain(). There seems to be no test for the explanation descriptions, and I did not have a look at the actually produced getDescription() of the returned Explanation in this case.
          Hide
          paul.elschot@xs4all.nl Paul Elschot added a comment -

          I like the Boolean for indicating the match.
          The demo-fix.patch applies cleanly on my working copy, and all tests pass with it.
          I'll keep the patch in my working copy for now.

          Regards,
          Paul Elschot

          Show
          paul.elschot@xs4all.nl Paul Elschot added a comment - I like the Boolean for indicating the match. The demo-fix.patch applies cleanly on my working copy, and all tests pass with it. I'll keep the patch in my working copy for now. Regards, Paul Elschot
          Hide
          hossman Hoss Man added a comment -

          Demo of the basic direction I'm going. This patch inlcudes some changes to the Explanation class to include the new information, as well as some tweaks to TermQuery and BooleanQuery to take advantage of it.

          NOTE: the BooleanQuery changes in this patch overlap with he patches in LUCENE-557

          Show
          hossman Hoss Man added a comment - Demo of the basic direction I'm going. This patch inlcudes some changes to the Explanation class to include the new information, as well as some tweaks to TermQuery and BooleanQuery to take advantage of it. NOTE: the BooleanQuery changes in this patch overlap with he patches in LUCENE-557

            People

            • Assignee:
              hossman Hoss Man
              Reporter:
              hossman Hoss Man
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development