Details
- Type: Task
- Status: Closed
- Priority: Major
- Resolution: Fixed
- Fix Version: 1.4
- Labels: None
Description
Because of backwards-compatibility problems (LUCENE-1906), the CharStream/CharFilter API was changed slightly. Tokenizer now has only an input field of type java.io.Reader (as before the CharStream code was introduced). To correct offsets, you must now call the Tokenizer.correctOffset(int) method, which delegates to the CharStream if the input is a subclass of CharStream, and otherwise returns the offset unchanged. In most Tokenizers it is enough to replace all occurrences of input.correctOffset() with this.correctOffset(). Custom Tokenizers in Solr should also be checked to verify that they correct their offsets.
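To illustrate the delegation described above, here is a minimal, self-contained sketch of the pattern. The classes below (CharStream, ShiftedCharStream, Tokenizer) are simplified stand-ins, not the actual Lucene classes: the real Tokenizer base class lives in org.apache.lucene.analysis and has a richer API. The sketch only models the part the description talks about, namely that correctOffset(int) delegates when the input happens to be a CharStream and falls back to the uncorrected offset otherwise.

```java
import java.io.FilterReader;
import java.io.Reader;
import java.io.StringReader;

class Demo {
    // Hypothetical stand-in for Lucene's CharStream: a Reader that can
    // map an offset in the filtered output back to the original input.
    static abstract class CharStream extends FilterReader {
        protected CharStream(Reader in) { super(in); }
        public abstract int correctOffset(int currentOff);
    }

    // A CharStream that shifts every offset by a fixed amount, as a
    // CharFilter that removed characters in front of the token might.
    static class ShiftedCharStream extends CharStream {
        private final int shift;
        ShiftedCharStream(Reader in, int shift) { super(in); this.shift = shift; }
        @Override
        public int correctOffset(int currentOff) { return currentOff + shift; }
    }

    // Mirrors the described Tokenizer behavior: the input field is a plain
    // java.io.Reader, and correctOffset delegates only when the input
    // actually is a CharStream; otherwise the offset is returned as-is.
    static class Tokenizer {
        protected final Reader input;
        Tokenizer(Reader input) { this.input = input; }
        protected final int correctOffset(int currentOff) {
            return (input instanceof CharStream)
                ? ((CharStream) input).correctOffset(currentOff)
                : currentOff; // plain Reader: nothing to correct
        }
    }
}
```

With this model, a Tokenizer subclass simply calls this.correctOffset(off) for its start and end offsets, and the call is a no-op unless a CharFilter chain (a CharStream) was wrapped around the reader, which is exactly the "change input.correctOffset() to this.correctOffset()" migration the description recommends.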
Attachments
Issue Links
- relates to
  - LUCENE-1906 Backwards problems with CharStream and Tokenizers with custom reset(Reader) method (Closed)
  - SOLR-1404 Random failures with highlighting (Closed)