Details
Type: Bug
Status: Closed
Priority: Blocker
Resolution: Invalid
Affects Version/s: 4.7
Fix Version/s: None
Description
WordDelimiterFilterFactory generates word parts although splitting is deactivated in the configuration.
This is the fieldType setup from my schema:
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_de.txt" enablePositionIncrements="true" />
    <filter class="solr.WordDelimiterFilterFactory" stemEnglishPossessive="0" generateWordParts="0" generateNumberParts="0" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="0" splitOnNumerics="0" preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.SynonymFilterFactory" synonyms="lang/synonyms_de.txt" ignoreCase="true" expand="true" />
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_de.txt" enablePositionIncrements="true" />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0" catenateWords="1" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0" splitOnNumerics="0" preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
</fieldType>
The given search term is: X-002-99-495
WordDelimiterFilterFactory indexes the following word parts:
- X-002-99-495
- X (shouldn't be there)
- 00299495 (shouldn't be there)
- X00299495
But the 'X' should not be indexed or queried as a single term. You can see that splitting is completely deactivated in the schema.
I can move the character part around in the search term:
Searching for 002-abc-99-495 gives me:
- 002-abc-99-495
- 002 (shouldn't be there)
- abc (shouldn't be there)
- 99495 (shouldn't be there)
- 002abc99495
Even when the term is F00-22-761, WDF splits it up:
- F00-22-761
- F00 (shouldn't be there)
- 22761 (shouldn't be there)
- F0022761
Please have a look at the screenshot.
This is not what I expect from the configuration! I think this must be a bug.
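One way to narrow this down is to rerun the analysis with the catenate options switched off as well. The fragment below is only a sketch of that test. It is the index-time filter from the schema above with catenateWords and catenateNumbers changed to 0, on the assumption (not verified here against 4.7) that the extra tokens come from the catenate options rather than from the generate or split options. If the assumption holds, only the preserved original token should remain for X-002-99-495.

```xml
<!-- Diagnostic variant (assumption, not a confirmed fix): every generate,
     catenate and split option is 0; only preserveOriginal stays enabled. -->
<filter class="solr.WordDelimiterFilterFactory"
        stemEnglishPossessive="0"
        generateWordParts="0"
        generateNumberParts="0"
        catenateWords="0"
        catenateNumbers="0"
        catenateAll="0"
        splitOnCaseChange="0"
        splitOnNumerics="0"
        preserveOriginal="1"/>
```

The token output before and after the change can be compared on the Analysis screen of the Solr admin UI, using the same input terms as above.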