[LUCENE-760] Spellchecker could/should use n-gram tokenizers instead of rolling its own n-gramming - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Minor
Resolution: Won't Fix
Affects Version/s: None
Fix Version/s: None
Component/s: modules/analysis
Labels: None
Lucene Fields: New

Description

SpellChecker.java under contrib/spellchecker currently creates its own n-grams in two places: once while building the index it searches for alternative spelling suggestions, and again when it receives a query string/word to look up suggestions for.
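For readers less familiar with the technique, here is a minimal sketch of the kind of character n-gramming described above, applied the same way to an indexed word and to a query word. It is an illustration only: the class `NGramSketch` and the methods `ngrams` and `sharedGrams` are hypothetical names, not code from SpellChecker.java, and counting shared grams is a simplification that says nothing about how the real class ranks candidates.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not taken from SpellChecker.java.
public class NGramSketch {

    // Returns every contiguous substring of length n, left to right.
    // A word shorter than n yields an empty list.
    static List<String> ngrams(String word, int n) {
        List<String> grams = new ArrayList<>();
        if (n <= 0) {
            return grams;
        }
        for (int i = 0; i + n <= word.length(); i++) {
            grams.add(word.substring(i, i + n));
        }
        return grams;
    }

    // Counts the distinct n-grams that two words have in common.
    // Assumption: a plain overlap count stands in for real scoring.
    static int sharedGrams(String indexed, String query, int n) {
        Set<String> shared = new LinkedHashSet<>(ngrams(indexed, n));
        shared.retainAll(ngrams(query, n));
        return shared.size();
    }

    public static void main(String[] args) {
        // Index time: grams of a correctly spelled dictionary word.
        System.out.println(ngrams("lucene", 3));
        // Query time: grams of the (possibly misspelled) lookup word.
        System.out.println(ngrams("lucnee", 3));
        // Overlap between the two gram sets.
        System.out.println(sharedGrams("lucene", "lucnee", 3));
    }
}
```

Because the same gram-producing step runs at both index time and query time, it is a natural candidate for a single shared component.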

I think it would be better to hand the n-gram generation off to the n-gram tokenizers that just made their way into contrib/analyzers via LUCENE-759.

If I see nods, or at least no nays, I'll go and refactor SpellChecker.java a little bit to allow this.
SpellChecker has a page on the Wiki: http://wiki.apache.org/jakarta-lucene/SpellChecker

Thoughts?

People

Assignee: Otis Gospodnetic
Reporter: Otis Gospodnetic
Votes: 0
Watchers: 0

Dates

Created: 22/Dec/06 23:49
Updated: 28/Aug/22 11:33
Resolved: 14/May/08 06:13