SOLR-3758

SpellCheckComponent doesn't work when using group.

    Details

      Description

      It seems that the spellchecker using solr.DirectSolrSpellChecker doesn't work when grouping results.

      /select?q=mispeled
      gives me correct spelling suggestions,

      but
      /select?q=mispeled&group=true&group.main=true&group.field=title
      doesn't give any suggestions.

      It worked in 3.5 with the index-based spellchecker.

      It seems that if I misspell something that returns 0 results, I don't get any suggestions. If I misspell something that generates a result, I get suggestions for it.
      It should come up with proper suggestions even if there are no results to display (as long as there are things that should be suggested).
      Long story short: same functionality as in 3.5.

      1. SOLR-3758.patch
        10 kB
        James Dyer
      2. SOLR-3758.patch
        12 kB
        James Dyer

        Activity

        Uwe Schindler added a comment -

        Closed after release.

        James Dyer added a comment -

        trunk r1463219
        branch 4x r1463220

        James Dyer added a comment -

        Here's an updated patch with all tests passing and nocommits removed. I also reorganized DistributedSpellCheckComponentTest to make it more readable.

        I plan to commit this in a few days.

        James Dyer added a comment -

        Here's a patch that changes SpellCheckComponent#modifyRequest to do its work when the request purpose is ShardRequest.PURPOSE_GET_TOP_GROUPS rather than ShardRequest.PURPOSE_GET_TOP_IDS when grouping is enabled. On grouped requests, intra-shard requests with ShardRequest.PURPOSE_GET_TOP_IDS are only sent to shards that contain Top Groups, so SpellCheckComponent wouldn't have the opportunity to find all of its suggestions in a grouped scenario.

        There is also a change to DistributedSpellCheckComponentTest to randomly enable grouping for each test scenario. However, we still get failures related to the "correctlySpelled" flag and with the use of "alternativeTermCount". So we still need to investigate why distributed vs non-dist return different results in these cases.

        This patch I think is complete enough for evaluation purposes.
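
        As an illustration only (not the patch itself), the purpose check described above can be sketched without any Solr dependency. The constant names follow ShardRequest, but the bit values below are placeholders rather than Solr's real values, and the class and method names are hypothetical.

```java
// Stand-alone sketch of the purpose selection described in the comment above.
// Nothing here is Solr code: the constants only borrow ShardRequest's names,
// and their bit values are placeholders chosen for this example.
final class SpellcheckPurposeSketch {
    static final int PURPOSE_GET_TOP_GROUPS = 1 << 0; // placeholder value
    static final int PURPOSE_GET_TOP_IDS = 1 << 1;    // placeholder value

    /**
     * Returns true when the spellcheck component should add its parameters
     * to a shard request with the given purpose bit mask.
     * With grouping on, the work is tied to the "top groups" request;
     * otherwise it stays on the "top ids" request.
     */
    static boolean shouldRequestSuggestions(int purpose, boolean groupingEnabled) {
        int trigger = groupingEnabled ? PURPOSE_GET_TOP_GROUPS : PURPOSE_GET_TOP_IDS;
        return (purpose & trigger) != 0;
    }

    public static void main(String[] args) {
        // Grouped request: only the top-groups phase carries spellcheck work.
        System.out.println(shouldRequestSuggestions(PURPOSE_GET_TOP_GROUPS, true));
        System.out.println(shouldRequestSuggestions(PURPOSE_GET_TOP_IDS, true));
        // Ungrouped request: the top-ids phase carries it, as before.
        System.out.println(shouldRequestSuggestions(PURPOSE_GET_TOP_IDS, false));
    }
}
```

        The point of keying on the top-groups phase is that, per the explanation above, the top-ids phase of a grouped request only reaches shards that contain Top Groups.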

        Alexander Kingson added a comment - - edited

        Hi,

        Commenting out

        if (!params.getBool(COMPONENT_NAME, false) || spellCheckers.isEmpty()) {
          return;
        }

        in SpellCheckComponent#process() solves the issue, because when group=true, params.getBool(COMPONENT_NAME, false) is false.

        Which part constructs this params variable?

        Thanks.
        Alex.

        James Dyer added a comment -

        I looked at this a little, and it seems that when "group=true", the first-stage request doesn't reach all the shards. In the case I was testing, with 2 shards, only 1 shard would get the request. This would make the spellchecker work some of the time and fail at other times. I haven't figured out for sure why this happens, though. Possibly the grouping logic short-circuits and doesn't bother sending requests to shards that are known not to contain the groups that will be returned?

        James Dyer added a comment -

        Use case from user list:

        From: alxsss@aim.com
        Sent: Friday, March 22, 2013 12:53 PM
        To: solr-user@lucene.apache.org
        Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

        Hello,

        Further investigation shows the following pattern for both the DirectIndex and wordbreak spellcheckers.

        Assume that in all cases there are spellchecker results when distrib=false.

        In distributed mode (distrib=true):

        Case when matches=0
        1. group=true, no spellcheck results
        2. group=false, there are spellcheck results

        Case when matches>0
        1. group=true, there are spellcheck results
        2. group=false, there are spellcheck results

        Do these constitute a failing test case?

        Thanks.
        Alex.
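
        The four cases above can be written down as data, which is roughly the shape a failing test would take. This is only a sketch: the class, field, and method names are hypothetical, the observations are copied from the message above, and the expectation (suggestions in every case) follows from the stated assumption that distrib=false always returns them.

```java
import java.util.ArrayList;
import java.util.List;

// Encodes the reported distrib=true observations and picks out the ones
// that contradict the expectation that suggestions are always returned.
final class GroupedSpellcheckMatrix {

    /** One reported scenario; every scenario was run with distrib=true. */
    static final class Scenario {
        final boolean hasMatches;          // true for "matches>0", false for "matches=0"
        final boolean group;               // value of the group parameter
        final boolean suggestionsReturned; // what the reporter observed

        Scenario(boolean hasMatches, boolean group, boolean suggestionsReturned) {
            this.hasMatches = hasMatches;
            this.group = group;
            this.suggestionsReturned = suggestionsReturned;
        }
    }

    /** The four observations, in the order they are listed in the message. */
    static List<Scenario> reported() {
        List<Scenario> scenarios = new ArrayList<>();
        scenarios.add(new Scenario(false, true, false));
        scenarios.add(new Scenario(false, false, true));
        scenarios.add(new Scenario(true, true, true));
        scenarios.add(new Scenario(true, false, true));
        return scenarios;
    }

    /** Scenarios where no suggestions came back although they were expected. */
    static List<Scenario> failing(List<Scenario> all) {
        List<Scenario> out = new ArrayList<>();
        for (Scenario s : all) {
            if (!s.suggestionsReturned) {
                out.add(s);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (Scenario s : failing(reported())) {
            System.out.println("hasMatches=" + s.hasMatches + " group=" + s.group);
        }
    }
}
```

        Only the combination of zero matches with group=true deviates, which matches the original description of this issue.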

        ----Original Message----
        From: alxsss <alxsss@aim.com>
        To: solr-user <solr-user@lucene.apache.org>
        Sent: Thu, Mar 21, 2013 6:50 pm
        Subject: Re: strange behaviour of wordbreak spellchecker in solr cloud

        Hello,

        I am debugging the SpellCheckComponent#finishStage.

        From the responses I see that not only the wordbreak spellchecker but also the direct spellchecker
        fails to return some results in distributed mode.
        The request handler I was using had

        <str name="group">true</str>

        So I decided to turn off grouping, and now I see spellcheck results in distributed
        mode.

        curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler'
        has no spellcheck results,
        but

        curl 'server1:8983/solr/test/testhandler?q=paulusoles&indent=true&rows=10&shards.qt=testhandler&group=false'
        returns results.

        So the conclusion is that grouping causes the distributed spellchecker to fail.

        Could you please point me to the class that may be responsible for this issue?

        Thanks.
        Alex.


          People

          • Assignee:
            James Dyer
            Reporter:
            Christian Johnsson
          • Votes:
            2
            Watchers:
            5
