[LUCENE-4063] FrenchLightStemmer performs abusive compression of (arbitrary) repeated characters in long tokens - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Minor
Resolution: Fixed
Affects Version/s: 3.4, 4.0-ALPHA
Fix Version/s: 4.0-ALPHA
Component/s: modules/analysis
Labels: None
Lucene Fields: New, Patch Available

Description

FrenchLightStemmer aggressively collapses runs of repeated characters, even when the token is a number.
This behavior may be unexpected in full-text search.
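
To make the reported behavior concrete, here is a minimal Java sketch of a repeated-character collapse step. It is an illustration only, not the Lucene source: the class `RepeatCollapseSketch`, the method `collapseRepeats`, and the `lettersOnly` flag are hypothetical names, and the sketch ignores any token-length threshold the real stemmer may apply. The letters-only variant is one possible mitigation, not necessarily what the attached patch does.

```java
// Hypothetical illustration; this is NOT FrenchLightStemmer itself.
final class RepeatCollapseSketch {

    /**
     * Collapses each run of identical adjacent characters to a single character.
     * When lettersOnly is true, runs of non-letter characters (e.g. digits)
     * are left untouched.
     */
    static String collapseRepeats(String token, boolean lettersOnly) {
        StringBuilder out = new StringBuilder(token.length());
        for (int i = 0; i < token.length(); i++) {
            char ch = token.charAt(i);
            // True when ch equals the character most recently kept.
            boolean repeated = out.length() > 0 && out.charAt(out.length() - 1) == ch;
            // In lettersOnly mode, only letters are eligible for collapsing.
            boolean collapsible = !lettersOnly || Character.isLetter(ch);
            if (repeated && collapsible) {
                continue; // drop the duplicate character
            }
            out.append(ch);
        }
        return out.toString();
    }
}
```

With this sketch, `collapseRepeats("1234555", false)` yields `"12345"`, so a number loses digits, while `collapseRepeats("1234555", true)` returns the token unchanged and still collapses letter runs such as `"appelle"` to `"apele"`.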

Attachments

SOLR-3463.patch (16/May/12 16:08, 2 kB, Tanguy Moal)
SOLR-3463.patch (16/May/12 16:40, 2 kB, Tanguy Moal)
SOLR-3463.patch (16/May/12 18:28, 2 kB, Tanguy Moal)
LUCENE-4063.patch (16/May/12 19:59, 3 kB, Steven Rowe)

People

Assignee: Steven Rowe

Reporter: Tanguy Moal

Votes: 0

Watchers: 2

Dates

Created: 16/May/12 15:35

Updated: 28/Aug/22 13:17

Resolved: 18/May/12 16:46