[SOLR-7193] Concatenate words from token stream - ASF JIRA

Details

Type: New Feature
Status: Resolved
Priority: Major
Resolution: Duplicate
Affects Version/s: None
Fix Version/s: None
Component/s: Schema and Analysis
Labels: None

Description

User-entered data often lacks proper spacing between words, and spelling and formatting vary across fields such as business names and addresses. After tokenizing the data, we may apply pattern replacement, stop-word filtering, and similar steps. We then want to concatenate all the remaining tokens into a single token and generate n-gram tokens from it, so that business names can be indexed and fuzzy-matched.
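
The pipeline described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the attached patch and not Solr code: the function name `analyze`, the stop-word list, the punctuation pattern, and the gram sizes are all assumptions chosen for the example.

```python
import re

# Hypothetical stop-word list and replacement pattern, for illustration only.
STOP_WORDS = {"the", "of", "and"}
NON_ALNUM = re.compile(r"[^a-z0-9]+")


def analyze(text, min_gram=2, max_gram=3):
    """Tokenize, filter, concatenate, then emit character n-grams."""
    # 1. Lowercase and tokenize on whitespace.
    tokens = text.lower().split()
    # 2. Pattern replacement: strip non-alphanumeric characters from each token.
    tokens = [NON_ALNUM.sub("", t) for t in tokens]
    # 3. Stop-word filtering (also drops tokens emptied by step 2).
    tokens = [t for t in tokens if t and t not in STOP_WORDS]
    # 4. Concatenate the surviving tokens into one token, with no separator.
    joined = "".join(tokens)
    # 5. Character n-grams over the concatenated token, shortest grams first.
    return [
        joined[i:i + n]
        for n in range(min_gram, max_gram + 1)
        for i in range(len(joined) - n + 1)
    ]
```

Under these assumptions, `analyze("The Home Depot")` and `analyze("HomeDepot")` produce the same n-grams, because both inputs reduce to the same concatenated token before the n-gram step. That is the property that makes matching tolerant of missing or extra spaces.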

Attachments

concatenate_words.patch (11 kB), attached by Abhishek Bafna, 05/Mar/15 06:00

Issue Links

links to

org.opensextant.solrtexttagger.ConcatenateFilterFactory

People

Assignee: Unassigned

Reporter: Abhishek Bafna

Votes: 0

Watchers: 4

Dates

Created: 05/Mar/15 05:47

Updated: 13/Jun/18 16:24

Resolved: 13/Jun/18 16:24