[LUCENE-4185] CharFilters being added twice in Solr - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 4.0-ALPHA
Fix Version/s: 4.0-BETA, 6.0
Component/s: modules/analysis
Labels: None

Lucene Fields: New

Description

Debugging one of my test cases, I found that a TokenStream from an Analyzer constructed by Solr contains the configured chain of CharFilters twice.

While I may be mistaken, the fix for LUCENE-4142 appears to make the fix for LUCENE-3721 unnecessary, and the combination of the two fixes results in the CharFilters being applied twice.

I came across this with a test case involving an HTMLStripCharFilter, where the input string contains "&lt;h1&gt;". After passing through one HTMLStripCharFilter, it becomes "<h1>", and then the second filter removes it as HTML.
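The effect described above can be illustrated with a minimal sketch. It does not use Lucene or Solr: `stripOnce` is a hypothetical toy stand-in that removes tags and decodes only two entities, and `applyChain` is a hypothetical helper that runs it a given number of times. The real HTMLStripCharFilter behaves differently in detail; the sketch only shows why a second pass changes the result.

```java
import java.util.regex.Pattern;

final class DoubleStripDemo {
    // Toy tag matcher; an assumption for illustration, not Lucene's grammar.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    // One simplified stripping pass: drop literal tags, then decode the
    // &lt; and &gt; entities. NOT the real HTMLStripCharFilter.
    static String stripOnce(String input) {
        String withoutTags = TAG.matcher(input).replaceAll("");
        return withoutTags.replace("&lt;", "<").replace("&gt;", ">");
    }

    // Apply the toy pass the given number of times, mimicking a chain
    // that has been wrapped around the input more than once.
    static String applyChain(String input, int passes) {
        String out = input;
        for (int i = 0; i < passes; i++) {
            out = stripOnce(out);
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "&lt;h1&gt; heading";
        // One pass keeps the escaped tag as visible text.
        System.out.println("[" + applyChain(input, 1) + "]");
        // A second pass treats the decoded text as markup and removes it.
        System.out.println("[" + applyChain(input, 2) + "]");
    }
}
```

With one pass the escaped tag survives as the literal text "<h1>"; with two passes it disappears, which matches the symptom reported here.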

Attachments

LUCENE-4185.patch (10 kB), 06/Jul/12 15:29, Robert Muir

People

Assignee: Unassigned

Reporter: Michael Froh

Votes: 0

Watchers: 2

Dates

Created: 02/Jul/12 14:44

Updated: 28/Aug/22 13:20

Resolved: 07/Jul/12 00:39