Solr
  1. Solr
  2. SOLR-2853

SpellCheckCollator.collate method creates the a PossibilityIterator with maxTries instead of maxCollations

    Details

    • Type: Bug Bug
    • Status: Reopened
    • Priority: Minor Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.3, 4.0-ALPHA
    • Fix Version/s: None
    • Component/s: spellchecker
    • Labels:
      None

      Description

      Class SpellCheckCollator creates a new possibility iterator based on the spellcheck results. The iterator is created with:
      PossibilityIterator possibilityIter = new PossibilityIterator(result.getSuggestions(), maxTries, maxEvaluations);

      The issue is maxTries, should be maxCollations. Correct me if I'm wrong.

      1. SOLR-2853.patch
        0.8 kB
        Matt Traynham
      2. SOLR-2853.patch
        2 kB
        James Dyer

        Activity

        Hide
        James Dyer added a comment -

        Updated patch & added a unit test.

        With "spellcheck.maxCollationTries=0&spellcheck.maxCollations=2", the patched version will return 2 collations as requested. Prior to the patch, "spellcheck.maxCollations" is ignored if "spellcheck.maxCollationTries" is zero.

        Show
        James Dyer added a comment - Updated patch & added a unit test. With "spellcheck.maxCollationTries=0&spellcheck.maxCollations=2", the patched version will return 2 collations as requested. Prior to the patch, "spellcheck.maxCollations" is ignored if "spellcheck.maxCollationTries" is zero.
        Hide
        Matt Traynham added a comment -

        Attaching patch

        Show
        Matt Traynham added a comment - Attaching patch
        Hide
        Matt Traynham added a comment -

        James,
        Thanks for looking into this again. I completely agree with your fix and appreciate the 411 on the context between maxTries and maxCollations.

        Matt

        Show
        Matt Traynham added a comment - James, Thanks for looking into this again. I completely agree with your fix and appreciate the 411 on the context between maxTries and maxCollations. Matt
        Hide
        Matt Traynham added a comment -

        Reopened

        Show
        Matt Traynham added a comment - Reopened
        Hide
        James Dyer added a comment -

        Matt,

        Actuall, if you wouldn't mind, could you re-open this one (I can't), or otherwise I can open another issue. Taking a closer look after reading your last comment, there is a bug in the case the user sets "maxCollationTries" to 0. In SpellCheckCollator, we have:

        if (maxTries < 1) {
          maxTries = 1;
          verifyCandidateWithQuery = false;
        }
        

        But I think in this case we need it to set "maxTries" to the same value as "maxCollations". The current code will, as you point out, only return 1 collation no mater how many the user specified, unless "maxTries" > 0. I would think most users who want multiple collations would also want them verified, so this is probably not something that would get easily caught. An extra test case might be prudent as well.

        Show
        James Dyer added a comment - Matt, Actuall, if you wouldn't mind, could you re-open this one (I can't), or otherwise I can open another issue. Taking a closer look after reading your last comment, there is a bug in the case the user sets "maxCollationTries" to 0. In SpellCheckCollator, we have: if (maxTries < 1) { maxTries = 1; verifyCandidateWithQuery = false ; } But I think in this case we need it to set "maxTries" to the same value as "maxCollations". The current code will, as you point out, only return 1 collation no mater how many the user specified, unless "maxTries" > 0. I would think most users who want multiple collations would also want them verified, so this is probably not something that would get easily caught. An extra test case might be prudent as well.
        Hide
        Matt Traynham added a comment -

        Marking as invalid, based upon Mr. Dyer's comment.

        Show
        Matt Traynham added a comment - Marking as invalid, based upon Mr. Dyer's comment.
        Hide
        Matt Traynham added a comment -

        Ahh yes, that makes sense. I was under the assumption that one might want up to a "maxCollations" of possibilities returned, instead of just the first one, even if "maxTries" is zero. But thanks that clears up my issue.

        No problem looking into code, it's very interesting work.

        Show
        Matt Traynham added a comment - Ahh yes, that makes sense. I was under the assumption that one might want up to a "maxCollations" of possibilities returned, instead of just the first one, even if "maxTries" is zero. But thanks that clears up my issue. No problem looking into code, it's very interesting work.
        Hide
        James Dyer added a comment -

        Matt,

        Thanks for taking such a deep dive into this code. Its great to see people checking for things like this. I think the current code is correct, however. What PossibilityIterator is returning is a set of word combinations that SpellCheckCollator then needs to test against the index. So PossibilityIterator will return up to "maxTries" word combinations. But some of these possibilities could be nonsense and will return 0 hits when queried for against the index. SpellCheckCollator will throw these 0-hit possibilities out, trying each possibility until it has as many good ones as requested by "maxCollations", or until it has exhausted the list. (If the user sets maxCollationTries to zero, SpellCheckCollator won't test any and in this case will just return the first "maxCollations" possibilities back to the user.) Make sense?

        Show
        James Dyer added a comment - Matt, Thanks for taking such a deep dive into this code. Its great to see people checking for things like this. I think the current code is correct, however. What PossibilityIterator is returning is a set of word combinations that SpellCheckCollator then needs to test against the index. So PossibilityIterator will return up to "maxTries" word combinations. But some of these possibilities could be nonsense and will return 0 hits when queried for against the index. SpellCheckCollator will throw these 0-hit possibilities out, trying each possibility until it has as many good ones as requested by "maxCollations", or until it has exhausted the list. (If the user sets maxCollationTries to zero, SpellCheckCollator won't test any and in this case will just return the first "maxCollations" possibilities back to the user.) Make sense?

          People

          • Assignee:
            Unassigned
            Reporter:
            Matt Traynham
          • Votes:
            1 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:

              Development