  1. Solr
  2. SOLR-321

misleading comment about spellchecker's termSourceField in solrconfig.xml

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: documentation
    • Labels: None

    Description

      The config file comment says this about "termSourceField":

      "the field in your schema that you want to be able to build
      your spell index on. This should be a field that uses a very
      simple FieldType without a lot of Analysis (ie: string)"

      I think this is wrong, or at least misleading: the Lucene spellchecker uses a TermEnum to access the terms of this field, so the only requirement is that the field is indexed. Isn't the common use case of the spellchecker to use all of your terms in, e.g., "body" as candidates for spellchecking? In that case the field given for termSourceField should be, e.g., "body", which is usually indexed and tokenized.

      Of course, if you want "new yorc" to be corrected to "new york", this won't work with a tokenized field. I suggest this text for the comment:

      The field in your schema that you want to be able to build your spell index on. This must be a field that is indexed. If it is of type "text", all the terms in that field will be used as separate candidates for spellchecking; if it is of type "string", the complete content of that field is considered a single term. The latter might be useful if you have a field whose only content is something like 'new york' and the text you want to have spell checked is 'new yrok'.

      (Besides that, spellchecking more than one term doesn't seem to be supported. I'll see about adding a comment about that to the wiki.)
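
The difference between a tokenized and an untokenized source field can be illustrated with a small standalone sketch. This is not Solr or Lucene code: `field_terms` and `suggest` are hypothetical helpers, whitespace splitting stands in for a real analyzer, and Python's `difflib` stands in for the spellchecker's own similarity measure. It only shows which terms would be available as correction candidates in each case.

```python
import difflib


def field_terms(values, tokenized):
    """Return the set of terms a field would contribute to the spell index.

    tokenized=True roughly mimics a "text" field (assumption: lowercase and
    split on whitespace); tokenized=False mimics a "string" field, where the
    whole value is indexed as a single term.
    """
    terms = set()
    for value in values:
        if tokenized:
            terms.update(value.lower().split())
        else:
            terms.add(value)
    return terms


def suggest(word, terms, n=1, cutoff=0.5):
    # difflib is only a stand-in for the real spellchecker's distance measure.
    return difflib.get_close_matches(word, sorted(terms), n=n, cutoff=cutoff)


# Hypothetical field values, used for both field types.
values = ["new york", "san francisco"]
text_terms = field_terms(values, tokenized=True)
string_terms = field_terms(values, tokenized=False)

single_word_fix = suggest("yrok", text_terms)        # candidates are single words
phrase_fix = suggest("new yrok", string_terms)       # candidates are whole values
phrase_on_text = suggest("new yrok", text_terms)     # no multi-word candidate exists
```

With the tokenized field, every word is a separate candidate, so "yrok" can be matched against "york", but "new york" can never be suggested because it is not a term. With the string field, "new york" is itself a term and can be offered for "new yrok".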

          People

            Assignee: Unassigned
            Reporter: Daniel Naber (lucenebugs@danielnaber.de)