Solr / SOLR-2083

Problem with Distributed SpellCheck

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.1, 4.0-ALPHA
    • Component/s: spellchecker
    • Labels:
      None

      Description

      In DistributedSpellCheckComponentTest, if I add 10 additional documents to the index with field "lowerfilt" containing "The quack red fox jumped over the lazy brown dogs.", the sharded SpellCheckComponent wants to correct "quick" to "quack". The control (non-sharded) component correctly does not try to correct "quick", so the test subsequently fails.

      1. SOLR-2083.patch (13 kB, James Dyer)
      2. SOLR-2083.patch (1 kB, James Dyer)

          Activity

          James Dyer added a comment -

          This patch demos the problem.

          James Dyer added a comment -

          I think there are 2 related problems here:

          1. DistributedSpellCheckComponentTest.java runs the test 4 times: first with 1 shard, then with 2 shards, etc. Between iterations it does not clear the Jetty data directories, so the first shard of the 2-shard iteration still holds all the data from the 1-shard iteration, and so on. I can work around this by adding del("*:*"); as the first line in doTest(). Unfortunately, doing this makes the test fail. I think this problem in the test harness is masking a failing test.

          2. The component ought to report a word as misspelled only if ALL of the shards report it as not in the dictionary. However, the current implementation reports a word as misspelled if ANY shard reports it as not in the dictionary.

          This second patch version resolves the second issue. The problem with the test may warrant its own issue. (I used the workaround here).
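
          To illustrate the ANY vs. ALL distinction in item 2, here is a minimal, self-contained sketch. It is not the code from the patch and does not use the Solr API: the class ShardSpellMerge, its method names, and the sample terms are hypothetical, and each shard's response is modeled simply as the set of query terms that shard considers misspelled.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; not the SpellCheckComponent implementation.
final class ShardSpellMerge {

    // ANY semantics (the behavior described as wrong above):
    // a term is flagged if at least one shard flagged it.
    public static Set<String> misspelledIfAny(List<Set<String>> shardMisspellings) {
        Set<String> merged = new TreeSet<>();
        for (Set<String> shard : shardMisspellings) {
            merged.addAll(shard);
        }
        return merged;
    }

    // ALL semantics (the behavior proposed above):
    // a term is flagged only if every shard flagged it.
    public static Set<String> misspelledIfAll(List<Set<String>> shardMisspellings) {
        if (shardMisspellings.isEmpty()) {
            return new TreeSet<>();
        }
        Set<String> merged = new TreeSet<>(shardMisspellings.get(0));
        for (Set<String> shard : shardMisspellings.subList(1, shardMisspellings.size())) {
            merged.retainAll(shard); // keep only terms this shard also flagged
        }
        return merged;
    }

    public static void main(String[] args) {
        // Assumed scenario: shard 1 has "quick" in its dictionary, shard 2 does not.
        // Neither shard knows the made-up term "foxx".
        List<Set<String>> shards = List.of(
                Set.of("foxx"),
                Set.of("quick", "foxx"));

        System.out.println("ANY: " + misspelledIfAny(shards));
        System.out.println("ALL: " + misspelledIfAll(shards));
    }
}
```

          Under these assumptions, the ANY merge flags "quick" even though one shard has it in its dictionary, which mirrors the "quick" to "quack" correction from the description. The ALL merge leaves "quick" alone and flags only the term that no shard recognizes.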

          Grant Ingersoll added a comment -

          Committed. Thanks James!

          Grant Ingersoll added a comment -

          Bulk close for 3.1.0 release


            People

            • Assignee:
              Grant Ingersoll
              Reporter:
              James Dyer