Description
The current synonymsfilter uses a lot of ram and cpu, especially at build time.
I think yesterday I heard about "huge synonyms files" three times.
So, I think we should use an FST-based structure, sharing the inputs and outputs.
And we should be more efficient with the tokenStream api, e.g. using save/restoreState instead of cloneAttributes()
Attachments
Attachments
Issue Links
- is duplicated by
-
SOLR-2628 use of FST for SynonymsFilterFactory and synonyms.txt
- Resolved
- supercedes
-
LUCENE-2347 Dump WordNet to SOLR Synonym format
- Resolved