Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.4, 4.0-ALPHA
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      The current SynonymFilter uses a lot of RAM and CPU, especially at build time.

      I think yesterday I heard about "huge synonyms files" three times.

      So, I think we should use an FST-based structure, sharing the inputs and outputs.
      And we should be more efficient with the TokenStream API, e.g. using save/restoreState() instead of cloneAttributes().

      1. synonyms.zip
        575 kB
        Robert Muir
      2. LUCENE-3233.patch
        16 kB
        Robert Muir
      3. LUCENE-3233.patch
        48 kB
        Michael McCandless
      4. LUCENE-3233.patch
        44 kB
        Robert Muir
      5. LUCENE-3233.patch
        52 kB
        Michael McCandless
      6. LUCENE-3233.patch
        54 kB
        Michael McCandless
      7. LUCENE-3233.patch
        73 kB
        Robert Muir
      8. LUCENE-3233.patch
        78 kB
        Robert Muir
      9. LUCENE-3233.patch
        80 kB
        Robert Muir
      10. LUCENE-3233.patch
        83 kB
        Robert Muir
      11. LUCENE-3233.patch
        89 kB
        Michael McCandless
      12. LUCENE-3233.patch
        95 kB
        Robert Muir
      13. LUCENE-3233.patch
        91 kB
        Michael McCandless
      14. LUCENE-3233.patch
        94 kB
        Michael McCandless
      15. LUCENE-3233.patch
        252 kB
        Robert Muir
      16. LUCENE-3233.patch
        252 kB
        Robert Muir
      17. LUCENE-3233.patch
        248 kB
        Michael McCandless
      18. LUCENE-3223.patch
        29 kB
        Michael McCandless

        Issue Links

          Activity

          Robert Muir added a comment -

          Here's a rough start to building a data structure that I think makes good tradeoffs between RAM and processing.

          No matter what, the processing on the filter-side will be hairy because of the 'interleaving' with the tokenstream.

          This one is just an FST<CharsRef,Int[]>(BYTE4) where Int is an ord to a BytesRefHash, containing the output Bytes for each term.

          This way, at input time we can walk the FST with codePointAt()

          On both sides, the Chars/Bytes are actually phrases, using \u0000 as a word separator.
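
          The structure described above can be sketched in plain Java. This is a hypothetical illustration (the class and method names are mine, not from the patch): a HashMap stands in for the FST, and an ArrayList plus HashMap stand in for the BytesRefHash, but the dedup idea is the same — inputs map to lists of ords, each distinct output phrase is stored once, and words within a phrase are joined with the '\u0000' separator.

          ```java
          import java.util.*;

          // Sketch of the deduplicated synonym map idea: a HashMap stands in
          // for the FST<CharsRef,Int[]> and an ArrayList + HashMap stand in
          // for the BytesRefHash of output phrases.
          class SynonymMapSketch {
            private final Map<String, Integer> outputOrds = new HashMap<>(); // dedup table
            private final List<String> outputs = new ArrayList<>();          // ord -> phrase
            private final Map<String, List<Integer>> map = new HashMap<>();  // input -> ords

            // Join a multi-word phrase with the \u0000 word separator used on both sides.
            static String phrase(String... words) {
              return String.join("\u0000", words);
            }

            void add(String input, String output) {
              Integer ord = outputOrds.get(output);
              if (ord == null) {               // first time we see this output phrase
                ord = outputs.size();
                outputs.add(output);
                outputOrds.put(output, ord);
              }
              map.computeIfAbsent(input, k -> new ArrayList<>()).add(ord);
            }

            List<String> lookup(String input) {
              List<String> result = new ArrayList<>();
              for (int ord : map.getOrDefault(input, Collections.emptyList())) {
                result.add(outputs.get(ord));
              }
              return result;
            }

            int numDistinctOutputs() {
              return outputs.size();
            }
          }
          ```

          The point of the ord indirection is that an output phrase shared by many inputs costs its bytes only once.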

          Michael McCandless added a comment -

          Dumping my current state on FSTSynonymFilter – it compiles but it's got tons of bugs I'm sure! I added a trivial initial test.

          Michael McCandless added a comment -

          New patch w/ current state.

          I think it's closer; the test has more cases now (but I'd still like to make a random test), fewer nocommits, etc.

          Robert Muir added a comment -

          Patch with a first random test. This one currently does 10 iterations where it adds random entries to the synonym map, then analyzes 10k random strings (each time capturing the output and replaying it back to ensure the thing is deterministic and doesn't have reuse bugs).

          I also added the ignoreCase support.

          The filter might have a reuse bug; see ant test -Dtestcase=TestFSTSynonymMapFilter -Dtestmethod=testRandom -Dtests.seed=-4122723628721952592:244824441557739968

          Michael McCandless added a comment -

          New patch, folding in Robert's changes and the random stress test. All tests pass. I think it's now functionally correct, but I still need to compare perf vs existing syn filter, and there are still a few minor nocommits to work out.

          Ideally, if we get perf close enough, since RAM is much much less w/ this new syn filter, I think we should replace the old one with this new one.

          Michael McCandless added a comment -

          New patch, adding dedup option to the builder, removing a couple nocommits, cutting back on iters/counts in testRandom2.

          Robert Muir added a comment -

          Updated patch:

          • added a SolrSynonymsParser and test to the analyzers module, that parses the existing solr synonyms format.
          • added a Solr factory for this thing (untested!) that uses this when "format=solr" (the default)

          This way, the idea is the factory would be more extensible, e.g. you could load syns from a database, or we could add parsers for wordnet and nuke contrib/wordnet, etc etc.

          Still need to do some basic benchmarking.

          Robert Muir added a comment -

          Fixed some bugs and added some tests, but there is a problem: when I started to add a little benchmark, I hit this on my largish synonyms file:

          java.lang.IllegalStateException: max arc size is too large (445)
          

          Just run TestFSTSynonymFilterFactory and you will see it. I enabled some prints and it doesn't appear that anything totally stupid is going on... giving up for the night.

          Robert Muir added a comment -

          Attaching the synonyms.txt test file I was using; it's derived from WordNet.

          Michael McCandless added a comment -

          java.lang.IllegalStateException: max arc size is too large (445)

          Ahh – to fix this we have to call Builder.setAllowArrayArcs(false), i.e., disable the array arcs in the FST (and thus the binary-search lookup for finding arcs!). I had to do this also for MemoryCodec, since postings encoded as output per arc can be more than 256 bytes, in general.

          This will hurt perf, i.e., the arc lookup cannot use a binary search; it's because of a silly limitation in the FST representation: we use a single byte to hold the max size of all arcs, so if any arc is > 256 bytes we are unable to encode the node as an array. We could fix this (e.g., use vInt); however, arcs with widely varying sizes (due to widely varying outputs on each arc) would be very wasteful of space, because all arcs use up a fixed number of bytes when represented as an array.

          For now I think we should just call the above method, and then test the resulting perf.
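
          The space tradeoff described above is easy to see with some illustrative arithmetic (this is not Lucene code, just the layout math): with array-encoded arcs every arc occupies a fixed slot of maxArcSize bytes, so one oversized arc inflates the whole node, while packed arcs waste nothing but forbid binary search.

          ```java
          // Layout arithmetic for a single FST node's outgoing arcs.
          class ArcLayoutSketch {
            // Bytes used when arcs are laid out as a fixed-width array
            // (binary-searchable: every slot is as wide as the biggest arc).
            static int arrayEncodedBytes(int[] arcSizes) {
              int max = 0;
              for (int s : arcSizes) max = Math.max(max, s);
              return max * arcSizes.length;
            }

            // Bytes used when arcs are packed back to back (linear scan only).
            static int packedBytes(int[] arcSizes) {
              int sum = 0;
              for (int s : arcSizes) sum += s;
              return sum;
            }
          }
          ```

          For a node with three 3-byte arcs and one 445-byte arc, the array layout needs 4 × 445 = 1780 bytes versus 454 packed, which is why widely varying arc sizes make the array encoding so wasteful.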

          Michael McCandless added a comment -

          Actually, maybe a better general fix for FST would be for it to dynamically decide whether to make an array based on how many bytes will be wasted (in addition to the number of arcs / depth of the node). This way we could turn on arcs always, and FST would pick the right times to use it. If we stick to only 1 byte for the number of bytes per arc, the FST could simply not use the array when an arc is > 256 bytes.

          Robert Muir added a comment -

          Thanks Mike, I will set the option for now, we can address any potential perf hit in a number of different ways here (besides modifying FST itself).

          Robert Muir added a comment -

          I ran some quick numbers, using the syn file example here, just best of 3 runs:

          Impl                      Build time   RAM usage
          SynonymFilterFactory      6619 ms      207.92 MB
          FSTSynonymFilterFactory   463 ms       3.51 MB

          I modified the builder slightly to build the FST more efficiently for this, will upload the updated patch.

          So I think the build time and RAM consumption are really improved; the next thing is to benchmark the runtime perf.

          David Smiley added a comment -

          Wow that's striking; nice work guys. FSTs are definitely one of those killer pieces of technology in Lucene.

          The difference in build time is surprising to me. Any theory why SynonymFilterFactory takes so much more time to build?

          Robert Muir added a comment -

          The difference in build time is surprising to me. Any theory why SynonymFilterFactory takes so much more time to build?

          Yes, it's the n^2 portion, where you have a synonym entry like this: a, b, c, d
          In reality this creates entries like this:
          a -> a
          a -> b
          a -> c
          a -> d
          b -> a
          b -> b
          ...

          In the current impl, this is done using some inefficient data structures (like nested CharArrayMaps with Token), as well as calling merge().

          In the FST impl, we don't use any nested structures (instead, input and output entries are just phrases), and we explicitly deduplicate both inputs and outputs during construction; the FST output is just a List<Integer>, basically pointing to ords in the deduplicated BytesRefHash.

          So during construction, add() is just a HashMap lookup on the input phrase, a BytesRefHash get/put via UTF16toUTF8WithHash to get the output ord, and an append to an ArrayList.

          This code isn't really optimized right now and we can definitely speed it up even more in the future, but the main thing right now is to ensure the filter performance is good.
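
          The n^2 expansion described above can be sketched directly (hypothetical helper, not the patch's code): an undirected rule "a, b, c, d" becomes every input -> output pair, including the identity mappings, so n terms produce n × n entries.

          ```java
          import java.util.*;

          // Expand one undirected synonym rule into all of its
          // input -> output pairs, as the current impl effectively does.
          class RuleExpansion {
            static List<String[]> expand(String[] terms) {
              List<String[]> pairs = new ArrayList<>();
              for (String in : terms) {
                for (String out : terms) {
                  pairs.add(new String[] { in, out });  // e.g. a -> b
                }
              }
              return pairs;
            }
          }
          ```

          Four terms yield 16 pairs; a rule with 50 alternatives yields 2500, which is where the build-time cost comes from when each pair is inserted into nested map structures.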

          Robert Muir added a comment -

          Here is a patch with a little microbenchmark... so we have some tuning to do.

          The benchmark analyzes a short string a million times that doesn't match any synonyms (actually the Solr default).

          Impl                     ms
          SynonymsFilter           1692
          FST with array arcs      2794
          FST with no array arcs   8823

          So, disabling the array arcs is a pretty crucial hit here. But we could do other things to speed up this common case, e.g. with Daciuk we could build a CharRunAutomaton of the K-prefixes of the synonyms; this would be really fast at rejecting terms that don't match any syns.

          Or we could explicitly put our BytesRef output in a byte[], and use long pointers as outputs.

          Or we could speed up FST! But I think it's interesting to see how important this parameter is.
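
          The fast-reject idea mentioned above can be illustrated with a much simpler stand-in (this is a sketch, not the proposed CharRunAutomaton): precompute a set of the first word of every rule, and bail out of the synonym machinery for any token that can't start a match. The automaton over K-character prefixes is the same trick with a denser structure.

          ```java
          import java.util.*;

          // Reject most tokens without touching the full synonym structure,
          // using a precomputed set of rule-starting words.
          class PrefixReject {
            private final Set<String> firstWords = new HashSet<>();

            PrefixReject(List<String[]> rules) {
              for (String[] rule : rules) {
                firstWords.add(rule[0]);  // rule[0] = first input word of the rule
              }
            }

            // Returns false when the token cannot possibly start a synonym match.
            boolean mightMatch(String token) {
              return firstWords.contains(token);
            }
          }
          ```

          For the common "no synonyms match" case being benchmarked, most tokens would be rejected by this check alone and never walk the FST.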

          Michael McCandless added a comment -

          Wow, it's very important to allow arcs to be encoded as arrays (for the binary search on lookup). I think we should just fix FST... I'll think about it. MemoryCodec would also get big gains here.

          Robert Muir added a comment -

          I agree, this would be the best solution. Maybe we should just open a separate issue for that?

          We can let this one be for now until that is resolved, and can even continue working on other parts of it.

          Michael McCandless added a comment -

          New patch, including some optimizing to FST (which we can commit under a separate issue): array arcs can now be any size, and I re-use the BytesReader inner class that's created for parsing arcs.

          Robert Muir added a comment -

          New patch, including some optimizing to FST (which we can commit under a separate issue)

          Works! I don't think we need to open a new issue; I didn't think you would come back with a patch in just two hours!

          I'll play with the patch some now and see what I can do with it.

          Robert Muir added a comment -

          Updated patch: this tableizes the first FST arcs for Latin-1.

          Precomputing this tiny table speeds up this filter a ton (~3000 ms -> ~2000 ms), and I think it is a cheap, easy win for the terms index too.
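
          The root-arc table can be sketched like this (hypothetical types; the real patch caches Arc objects inside the FST): labels below 0x100 get a direct slot, so the first transition for Latin-1 input is a single array read, and anything else falls back to the general arc lookup, signalled here by -1.

          ```java
          // Direct-index cache for the FST's root arcs, covering Latin-1 labels.
          class RootArcCache {
            private final int[] table = new int[0x100];  // label -> target node, -1 = absent

            RootArcCache() {
              java.util.Arrays.fill(table, -1);
            }

            void put(int label, int target) {
              if (label < 0x100) table[label] = target;
            }

            // O(1) for Latin-1 labels; -1 means "use the slow path".
            int find(int label) {
              return label < 0x100 ? table[label] : -1;
            }
          }
          ```

          Since the root node is visited once per input token, replacing its arc search with one array read is where the ~3000 ms -> ~2000 ms win comes from.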

          Robert Muir added a comment -

          Benchmark with the synonyms.zip attached to this issue (so that we are actually matching synonyms):
          In this case I only analyzed the text 100,000 times, as it's a lot more output.
          I also checked they are emitting the same stuff:

          Impl             ms
          SynonymsFilter   112527
          FST              22872

          So, that's 5x faster, probably due to avoiding the expensive cloning.

          I think we are fine on performance.

          Robert Muir added a comment -

          So I don't forget: let's not waste an arc bitflag marking an arc as 'first'...

          I hear the secret is instead arc.target == startNode

          Michael McCandless added a comment -

          New patch, moving the root arcs cache into FST, not using up our last precious arc bit.

          Michael McCandless added a comment -

          Another rev of the patch: I did a hard bump of the FST version (so existing trunk indices must be rebuilt), added a NOTE in suggest's FST impl that the file format is experimental, removed maxVerticalContext, and fixed a false test failure.

          Michael McCandless added a comment -

          I think this is ready to commit, but I'd like to rename existing syn filter to SlowSynonymFilter and rename the new one to SynonymFilter.

          Because there are some minor diffs (deduping rules, lowercasing), for Solr to cutover I think we need some back compat logic; I'll open a separate issue for this.

          Robert Muir added a comment -

          I'll try to add some not-so-sophisticated backwards here!

          Yonik Seeley added a comment -

          but I'd like to rename existing syn filter to SlowSynonymFilter and rename the new one to SynonymFilter.

          But the lookup on the original is still faster, right? And if someone has small synonym dicts (actually pretty common in my experience, since SynonymFilter isn't necessarily used to inject synonyms in the traditional sense, but for any mapping task), then build time and mem use won't be much of an issue (esp if the input to match is mostly single words).

          This looks great for large synonym maps, but perhaps instead of Slow* or Fast* we could name them for the implementation and either name the new one FSTSynonymFilter or rename the current one to MapSynonymFilter? Or is the plan to actually deprecate the current SynonymFilter?

          Michael McCandless added a comment -

          But the lookup on the original is still faster, right?

          That was before we optimized FST for this usage case.

          Now, from the testing above, it looks like we are faster when syns actually match; if no syns match the two are around the same speed.

          Separately: shouldn't we not have any syns in the default text_en field type? Like we can have a synonyms.txt but comment out all the rules in there?

          I don't think we should keep the old one around, ie, we should [eventually] replace it with the new one.

          Robert Muir added a comment -

          This one is faster for the simple reason that the TokenFilter uses captureState() and restoreState() [and has logic to minimize cloning in more cases] instead of AttributeSource.cloneAttributes()/copyTo()
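
          The difference can be sketched conceptually (this is not the Lucene API, just the idea behind it): captureState() copies only the attribute values into a lightweight State object, and restoreState() copies them back, instead of deep-cloning the whole AttributeSource for every buffered token the way cloneAttributes()/copyTo() does.

          ```java
          // Minimal model of save/restore-style token state handling.
          class TokenStateSketch {
            String term;  // stand-in for CharTermAttribute
            int posInc;   // stand-in for PositionIncrementAttribute

            static final class State {
              final String term;
              final int posInc;
              State(String term, int posInc) { this.term = term; this.posInc = posInc; }
            }

            State captureState() {
              return new State(term, posInc);  // cheap value copy, no new attribute objects
            }

            void restoreState(State s) {
              term = s.term;
              posInc = s.posInc;
            }
          }
          ```

          Buffering a token costs one small value object here, versus allocating and wiring up an entire cloned attribute set per token in the cloneAttributes() approach.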

          Yonik Seeley added a comment -

          Now, from the testing above, it looks like we are faster when syns actually match; if no syns match the two are around the same speed.

          Oh cool! I was looking at "1692" for the SynonymsFilter and a drop from "~3000ms -> ~2000ms" for the FST version. I assumed Robert's last benchmark was building and not lookup (the 112527/22872).

          Separately: shouldn't we not have any syns in the default text_en field type?

          I dunno... it's nice for both demonstration and testing (and it's in the current tutorial).

          Robert Muir added a comment -

          Updated patch:

          • renamed to SynonymFilter
          • added not-so-sophisticated backwards layer
          • added tests
          • added parser for format="wordnet"
          • removed contrib/wordnet

          But I found some bugs (well, one surely is) in the new tests, so I added nocommits here.

          Robert Muir added a comment -

          Just updating the patch to trunk; the nocommits remain...

          Michael McCandless added a comment -

          New patch, also setting offsets in the produced tokens (the wordnet test passes), and adding a NOTE about the dup output words issue.

          I think it's finally ready to commit!

          Glen Stampoultzis added a comment -

          Does this support multi word synonyms? I tried one of the 3.4 builds and I was getting odd results on multi word synonyms.

          Michael McCandless added a comment -

          It does support multi-word synonyms – can you give some details on the odd behavior? Maybe email dev@lucene.apache.org, or open a new Jira issue?

          Glen Stampoultzis added a comment -

          I'll try and put together a simple test case just to make sure I've got something repeatable and post to the list. I think it might have been related to synonym fields tokenized using KeywordTokenizerFactory.

          Yonik Seeley added a comment -

          Looks like when Solr's synonym parsing was moved to the analysis module, it was also rewritten, introducing escaping bugs.

          Examples:
          a\,a is no longer treated as a single token
          a\=>a is no longer treated as a single token
          a\ta is treated as "ata" instead of containing a tab character

          I didn't do a full review, so I'm not sure if there are other differences in behavior.

          Robert Muir added a comment -

          Do you have a test?

          Because we have tests for this:

              String testFile = 
                "a\\=>a => b\\=>b\n" +
                "a\\,a => b\\,b";
          ...
              assertAnalyzesTo(analyzer, "a=>a",
                  new String[] { "b=>b" },
                  new int[] { 1 });
              
              assertAnalyzesTo(analyzer, "a,a",
                  new String[] { "b,b" },
                  new int[] { 1 });
          
          Yonik Seeley added a comment -

          I just tested by hand:
          I added a line to synonyms.txt "a\,a => b\,b", fired up the example server and then executed the following query:
          http://localhost:8983/solr/select?q=a,a&debugQuery=true

          I then verified that the synonyms were in effect in general, via:
          http://localhost:8983/solr/select?q=fooaaa&debugQuery=true

          Yonik Seeley added a comment -

          User error - the field type changes in the example schema tripped up the user (and me too). The standard tokenizer does not keep "a,a" as a single token.


            People

            • Assignee:
              Unassigned
              Reporter:
              Robert Muir
            • Votes:
               0
              Watchers:
               3

              Dates

              • Created:
                Updated:
                Resolved:
