[LUCENE-5468] Hunspell very high memory use when loading dictionary - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Minor
Resolution: Fixed
Affects Version/s: 3.5
Fix Version/s: 4.8, 6.0
Component/s: None
Labels: None
Lucene Fields: New

Description

The Hunspell stemmer requires a disproportionate amount of memory to load its dictionary and rules files.
For example, loading a 4.5 MB Polish dictionary (with an empty index!) causes the whole core to crash with various out-of-memory errors unless you set the max heap size close to 2 GB or more.
By comparison, Stempel using the same dictionary file works just fine with 1/8 of that (and possibly with lower values as well).

Sample error log entries:
http://pastebin.com/fSrdd5W1
http://pastebin.com/Lmi0re7Z
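
One rough way to put a number on the heap growth described above is to compare used heap before and after a load step. The sketch below is only an illustration: the class name HeapProbe, the helper names, and the word-list stand-in are hypothetical, and it does not call the Hunspell loader itself. Swap the lambda for the real dictionary load to measure it. System.gc() is only a hint to the JVM, so the result is an approximation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper, not part of Lucene: estimates heap growth caused by a load step.
class HeapProbe {

    /** Returns the currently used heap in bytes after requesting a GC (best effort only). */
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        System.gc(); // a hint to the JVM, not a guarantee
        return rt.totalMemory() - rt.freeMemory();
    }

    /**
     * Runs the loader and returns the approximate change in used heap, in bytes.
     * The loaded object is stored in holder[0] so it stays reachable while measuring.
     */
    static long measure(Supplier<?> loader, Object[] holder) {
        long before = usedHeap();
        holder[0] = loader.get();
        long after = usedHeap();
        return after - before;
    }

    public static void main(String[] args) {
        Object[] holder = new Object[1];
        // Stand-in for a dictionary load (assumption): an in-memory list of strings.
        long grown = measure(() -> {
            List<String> words = new ArrayList<>();
            for (int i = 0; i < 200_000; i++) {
                words.add("word" + i);
            }
            return words;
        }, holder);
        System.out.println("approx. heap growth: " + (grown / (1024 * 1024)) + " MB");
    }
}
```

Comparing the reported growth with the size of the dictionary file on disk gives a quick sense of the expansion factor, without having to wait for an out-of-memory error.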

Attachments

LUCENE-5468.patch, 27/Feb/14 20:48, 239 kB, Robert Muir
patch.txt, 15/Dec/11 20:01, 15 kB, Robert Muir

People

Assignee: Unassigned
Reporter: Maciej Lisiewski
Votes: 1
Watchers: 6

Dates

Created: 14/Dec/11 02:51
Updated: 28/Aug/22 14:01
Resolved: 27/Feb/14 22:52