Details
- Type: New Feature
- Status: Resolved
- Priority: Major
- Resolution: Duplicate
Description
Here I introduce the ConcatenateFilter (with its Factory), which concatenates/joins the tokens of a stream with a provided separator to produce one final token. It's similar to FingerprintFilter but doesn't deduplicate or sort. It's useful for doing exact-ish search on short text (think names or titles) with simple analysis. For this task it's faster than an equivalent PhraseQuery, and it solves the problem of matching the complete text rather than just a portion of the tokens. It's also useful when using Lucene to hold a dictionary of short names/phrases for entity extraction (aka text tagging). The OpenSextant SolrTextTagger uses it for this purpose, which is where I'm taking it from.
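To illustrate the behavior described above, here is a minimal plain-Java sketch that does not use the Lucene API. The class `ConcatenateSketch` and its methods (`analyze`, `concatenate`, `fingerprint`, `exactMatch`) are hypothetical names invented for this illustration, and the whitespace-plus-lowercase analysis is an assumed stand-in for "simple analysis"; none of this is the attached filter's actual code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.TreeSet;

// Hypothetical illustration only; not the Lucene ConcatenateFilter implementation.
class ConcatenateSketch {

    // Assumed "simple analysis": trim, lowercase, split on whitespace.
    static List<String> analyze(String text) {
        return Arrays.asList(text.trim().toLowerCase(Locale.ROOT).split("\\s+"));
    }

    // Joins tokens in their original order and keeps duplicates.
    static String concatenate(List<String> tokens, char separator) {
        return String.join(String.valueOf(separator), tokens);
    }

    // Fingerprint-style variant for contrast: sorts and deduplicates before joining.
    static String fingerprint(List<String> tokens, char separator) {
        return String.join(String.valueOf(separator), new TreeSet<>(tokens));
    }

    // Two texts match only if their single concatenated tokens are equal,
    // so a subset of the tokens is not enough.
    static boolean exactMatch(String indexed, String query) {
        return concatenate(analyze(indexed), ' ').equals(concatenate(analyze(query), ' '));
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("new", "york", "new");
        System.out.println(concatenate(tokens, ' '));                // new york new
        System.out.println(fingerprint(tokens, ' '));                // new york
        System.out.println(exactMatch("New York", "new   york"));    // true
        System.out.println(exactMatch("New York City", "New York")); // false
    }
}
```

Because the whole value collapses into one token, a match in this sketch is a single term comparison, which is the intuition behind the comparison with PhraseQuery; the sketch itself makes no performance measurement.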
Attachments
Issue Links
- duplicates: LUCENE-8332 New ConcatenateGraphFilter (move/rename CompletionTokenStream) (Closed)
- is required by: SOLR-12376 New TaggerRequestHandler (aka SolrTextTagger) (Closed)