[SOLR-11976] TokenizerChain is overwriting, not chaining TokenFilters in normalize() - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Minor
Resolution: Fixed
Affects Version/s: 8.0
Fix Version/s: 7.3
Component/s: search
Labels: None

Description

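TokenizerChain.normalize() overwrites rather than chains its TokenFilters: each multi-term-aware filter is created over the original input (in) instead of over the previous filter's output, so only the last such filter takes effect.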

This doesn't currently break search because normalize is not being used at the Solr level (AFAICT); rather, TextField has its own analyzeMultiTerm() that duplicates code from the newer normalize.

Code as is:

    TokenStream result = in;
    for (TokenFilterFactory filter : filters) {
      if (filter instanceof MultiTermAwareComponent) {
        filter = (TokenFilterFactory) ((MultiTermAwareComponent) filter).getMultiTermComponent();
        result = filter.create(in);
      }
    }

The fix is simple:

-        result = filter.create(in);
+        result = filter.create(result);
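
To make the effect of the one-word change concrete, here is a minimal, self-contained sketch. It does not use the Lucene or Solr classes: a plain String stands in for the TokenStream, and a UnaryOperator<String> stands in for TokenFilterFactory.create(). The class name, the method names, and the two example filters (lowercasing and hyphen removal) are hypothetical and chosen only for illustration.

```java
import java.util.List;
import java.util.Locale;
import java.util.function.UnaryOperator;

public class NormalizeChainSketch {

    // Hypothetical stand-ins for two multi-term-aware filters.
    static List<UnaryOperator<String>> demoFilters() {
        UnaryOperator<String> lowercase = s -> s.toLowerCase(Locale.ROOT);
        UnaryOperator<String> stripHyphens = s -> s.replace("-", "");
        return List.of(lowercase, stripHyphens);
    }

    // Mirrors the reported loop: every filter is applied to the original
    // input, so each iteration discards the previous filter's output.
    static String buggyNormalize(String in, List<UnaryOperator<String>> filters) {
        String result = in;
        for (UnaryOperator<String> filter : filters) {
            result = filter.apply(in);
        }
        return result;
    }

    // Mirrors the proposed fix: each filter is applied to the running
    // result, so the filters are chained in order.
    static String fixedNormalize(String in, List<UnaryOperator<String>> filters) {
        String result = in;
        for (UnaryOperator<String> filter : filters) {
            result = filter.apply(result);
        }
        return result;
    }

    public static void main(String[] args) {
        List<UnaryOperator<String>> filters = demoFilters();
        System.out.println(buggyNormalize("Wi-Fi", filters)); // prints "WiFi"
        System.out.println(fixedNormalize("Wi-Fi", filters)); // prints "wifi"
    }
}
```

In the overwriting version only the last filter (hyphen removal) has any effect, which is why the term keeps its original case.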

Issue Links

relates to

SOLR-12034 Replace TokenizerChain in Solr with Lucene's CustomAnalyzer (Resolved)

links to

GitHub Pull Request #322

People

Assignee: David Smiley

Reporter: Tim Allison

Votes: 1

Watchers: 6

Dates

Created: 12/Feb/18 21:08

Updated: 08/Jun/19 15:14

Resolved: 09/Mar/18 03:35
