Solr
  1. Solr
  2. SOLR-2853

SpellCheckCollator.collate method creates the a PossibilityIterator with maxTries instead of maxCollations

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 3.3, 4.0-ALPHA
    • Fix Version/s: 4.10, Trunk
    • Component/s: spellchecker
    • Labels:
      None

      Description

      Class SpellCheckCollator creates a new possibility iterator based on the spellcheck results. The iterator is created with:
      PossibilityIterator possibilityIter = new PossibilityIterator(result.getSuggestions(), maxTries, maxEvaluations);

      The issue is maxTries, should be maxCollations. Correct me if I'm wrong.

      1. SOLR-2853.patch
        2 kB
        James Dyer
      2. SOLR-2853.patch
        0.8 kB
        Matt Traynham

        Activity

        Matt Traynham created issue -
        Hide
        James Dyer added a comment -

        Matt,

        Thanks for taking such a deep dive into this code. Its great to see people checking for things like this. I think the current code is correct, however. What PossibilityIterator is returning is a set of word combinations that SpellCheckCollator then needs to test against the index. So PossibilityIterator will return up to "maxTries" word combinations. But some of these possibilities could be nonsense and will return 0 hits when queried for against the index. SpellCheckCollator will throw these 0-hit possibilities out, trying each possibility until it has as many good ones as requested by "maxCollations", or until it has exhausted the list. (If the user sets maxCollationTries to zero, SpellCheckCollator won't test any and in this case will just return the first "maxCollations" possibilities back to the user.) Make sense?

        Show
        James Dyer added a comment - Matt, Thanks for taking such a deep dive into this code. Its great to see people checking for things like this. I think the current code is correct, however. What PossibilityIterator is returning is a set of word combinations that SpellCheckCollator then needs to test against the index. So PossibilityIterator will return up to "maxTries" word combinations. But some of these possibilities could be nonsense and will return 0 hits when queried for against the index. SpellCheckCollator will throw these 0-hit possibilities out, trying each possibility until it has as many good ones as requested by "maxCollations", or until it has exhausted the list. (If the user sets maxCollationTries to zero, SpellCheckCollator won't test any and in this case will just return the first "maxCollations" possibilities back to the user.) Make sense?
        Hide
        Matt Traynham added a comment -

        Ahh yes, that makes sense. I was under the assumption that one might want up to a "maxCollations" of possibilities returned, instead of just the first one, even if "maxTries" is zero. But thanks that clears up my issue.

        No problem looking into code, it's very interesting work.

        Show
        Matt Traynham added a comment - Ahh yes, that makes sense. I was under the assumption that one might want up to a "maxCollations" of possibilities returned, instead of just the first one, even if "maxTries" is zero. But thanks that clears up my issue. No problem looking into code, it's very interesting work.
        Hide
        Matt Traynham added a comment -

        Marking as invalid, based upon Mr. Dyer's comment.

        Show
        Matt Traynham added a comment - Marking as invalid, based upon Mr. Dyer's comment.
        Matt Traynham made changes -
        Field Original Value New Value
        Status Open [ 1 ] Closed [ 6 ]
        Resolution Invalid [ 6 ]
        Hide
        James Dyer added a comment -

        Matt,

        Actuall, if you wouldn't mind, could you re-open this one (I can't), or otherwise I can open another issue. Taking a closer look after reading your last comment, there is a bug in the case the user sets "maxCollationTries" to 0. In SpellCheckCollator, we have:

        if (maxTries < 1) {
          maxTries = 1;
          verifyCandidateWithQuery = false;
        }
        

        But I think in this case we need it to set "maxTries" to the same value as "maxCollations". The current code will, as you point out, only return 1 collation no mater how many the user specified, unless "maxTries" > 0. I would think most users who want multiple collations would also want them verified, so this is probably not something that would get easily caught. An extra test case might be prudent as well.

        Show
        James Dyer added a comment - Matt, Actuall, if you wouldn't mind, could you re-open this one (I can't), or otherwise I can open another issue. Taking a closer look after reading your last comment, there is a bug in the case the user sets "maxCollationTries" to 0. In SpellCheckCollator, we have: if (maxTries < 1) { maxTries = 1; verifyCandidateWithQuery = false ; } But I think in this case we need it to set "maxTries" to the same value as "maxCollations". The current code will, as you point out, only return 1 collation no mater how many the user specified, unless "maxTries" > 0. I would think most users who want multiple collations would also want them verified, so this is probably not something that would get easily caught. An extra test case might be prudent as well.
        Hide
        Matt Traynham added a comment -

        Reopened

        Show
        Matt Traynham added a comment - Reopened
        Matt Traynham made changes -
        Resolution Invalid [ 6 ]
        Status Closed [ 6 ] Reopened [ 4 ]
        Hide
        Matt Traynham added a comment -

        James,
        Thanks for looking into this again. I completely agree with your fix and appreciate the 411 on the context between maxTries and maxCollations.

        Matt

        Show
        Matt Traynham added a comment - James, Thanks for looking into this again. I completely agree with your fix and appreciate the 411 on the context between maxTries and maxCollations. Matt
        Hide
        Matt Traynham added a comment -

        Attaching patch

        Show
        Matt Traynham added a comment - Attaching patch
        Matt Traynham made changes -
        Attachment SOLR-2853.patch [ 12500921 ]
        Hide
        James Dyer added a comment -

        Updated patch & added a unit test.

        With "spellcheck.maxCollationTries=0&spellcheck.maxCollations=2", the patched version will return 2 collations as requested. Prior to the patch, "spellcheck.maxCollations" is ignored if "spellcheck.maxCollationTries" is zero.

        Show
        James Dyer added a comment - Updated patch & added a unit test. With "spellcheck.maxCollationTries=0&spellcheck.maxCollations=2", the patched version will return 2 collations as requested. Prior to the patch, "spellcheck.maxCollations" is ignored if "spellcheck.maxCollationTries" is zero.
        James Dyer made changes -
        Attachment SOLR-2853.patch [ 12504072 ]
        Hide
        Shalin Shekhar Mangar added a comment -

        Hi James Dyer, is there a reason why this patch was not committed?

        Show
        Shalin Shekhar Mangar added a comment - Hi James Dyer , is there a reason why this patch was not committed?
        Hide
        James Dyer added a comment -

        Shalin, Probably back in 2011 I got sidetracked doing something else and never simply forgot! I can look into verifying the problem still exists and that the patch works in the next day or so, unless you want to.

        Show
        James Dyer added a comment - Shalin, Probably back in 2011 I got sidetracked doing something else and never simply forgot! I can look into verifying the problem still exists and that the patch works in the next day or so, unless you want to.
        Hide
        Shalin Shekhar Mangar added a comment -

        Thanks James, I figured as much. Please do if you have the time. Otherwise I'll look at it by the end of the week. We need to do something about SOLR-3029 as well.

        Show
        Shalin Shekhar Mangar added a comment - Thanks James, I figured as much. Please do if you have the time. Otherwise I'll look at it by the end of the week. We need to do something about SOLR-3029 as well.
        Hide
        James Dyer added a comment -

        This issue actually is fixed already. Before closing, I will commit the unit test and backport to 4.x/

        Show
        James Dyer added a comment - This issue actually is fixed already. Before closing, I will commit the unit test and backport to 4.x/
        Hide
        ASF subversion and git services added a comment -

        Commit 1609171 from jdyer@apache.org in branch 'dev/trunk'
        [ https://svn.apache.org/r1609171 ]

        SOLR-2853: unit test for "spellcheck.maxCollationTries=0"

        Show
        ASF subversion and git services added a comment - Commit 1609171 from jdyer@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1609171 ] SOLR-2853 : unit test for "spellcheck.maxCollationTries=0"
        Hide
        ASF subversion and git services added a comment -

        Commit 1609173 from jdyer@apache.org in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1609173 ]

        SOLR-2853: unit test for "spellcheck.maxCollationTries=0"

        Show
        ASF subversion and git services added a comment - Commit 1609173 from jdyer@apache.org in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1609173 ] SOLR-2853 : unit test for "spellcheck.maxCollationTries=0"
        Hide
        James Dyer added a comment -

        Thanks for reporting this issue, Matt!

        Show
        James Dyer added a comment - Thanks for reporting this issue, Matt!
        James Dyer made changes -
        Status Reopened [ 4 ] Resolved [ 5 ]
        Assignee James Dyer [ jdyer ]
        Fix Version/s 5.0 [ 12321664 ]
        Fix Version/s 4.10 [ 12327122 ]
        Resolution Fixed [ 1 ]
        Transition Time In Source Status Execution Times Last Executer Last Execution Date
        Open Open Closed Closed
        2h 1 Matt Traynham 26/Oct/11 00:19
        Closed Closed Reopened Reopened
        16h 12m 1 Matt Traynham 26/Oct/11 16:32
        Reopened Reopened Resolved Resolved
        986d 22h 52m 1 James Dyer 09/Jul/14 15:25

          People

          • Assignee:
            James Dyer
            Reporter:
            Matt Traynham
          • Votes:
            1 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development