[SOLR-2211] Create Solr FilterFactory for Lucene StandardTokenizer with UAX#29 support - ASF JIRA

Details

Type: New Feature
Status: Closed
Priority: Minor
Resolution: Fixed
Affects Version/s: 3.1
Fix Version/s: 3.1, 4.0-ALPHA
Component/s: None
Labels: None

Description

The Lucene 3.x StandardTokenizer with UAX#29 support provides benefits for tokenizing non-English text. At present it can be invoked by using the StandardTokenizerFactory and setting the Version to 3.1. However, it would be useful to be able to use the improved Unicode processing without necessarily including the IP address and email address processing of StandardAnalyzer. A FilterFactory that allows the StandardTokenizer with UAX#29 support to be used on its own would fill that gap.

Attachments

SOLR-2211.patch (6 kB), attached 08/Nov/10 22:22 by Tom Burton-West

Issue Links

is related to: LUCENE-2763 Swap URL+Email recognizing StandardTokenizer and UAX29Tokenizer (Closed)

People

Assignee: Robert Muir

Reporter: Tom Burton-West

Votes: 0

Watchers: 1

Dates

Created: 01/Nov/10 17:30

Updated: 30/Mar/11 15:45

Resolved: 08/Nov/10 22:51