[SOLR-11700] WordDelimiterGraphFilterFactory token positions - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 7.1
Fix Version/s: None
Component/s: Schema and Analysis
Labels: None
Environment: Mac OSX, JDK 8

Description

The token positions generated by WordDelimiterGraphFilterFactory are incorrect.

This causes problems when doing phrase searches.

The Solr Reference Guide gives the following example:
https://lucene.apache.org/solr/guide/6_6/filter-descriptions.html

<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.WordDelimiterGraphFilterFactory" catenateAll="1"/>
</analyzer>

In: "XL-4000/ES"
Tokenizer to Filter: "XL-4000/ES"(1)
Out: "XL"(1), "4000"(2), "ES"(3), "XL4000ES"(3)

On my machine, however, the concatenated token is emitted at position 1 when it should be at position 3:
Out: "XL4000ES"(1), "XL"(1), "4000"(2), "ES"(3)
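
To make the difference concrete: Lucene token streams carry a position increment per token rather than an absolute position, and the positions shown above are the running sum of those increments. The sketch below is illustrative only. The helper name `absolute_positions` and the two token lists are hypothetical, and the increments are inferred from the positions quoted in this report, not captured from a running Solr instance.

```python
from itertools import accumulate

def absolute_positions(tokens):
    """Turn (term, position_increment) pairs into (term, position) pairs."""
    terms = [term for term, _ in tokens]
    # A token's position is the running sum of all increments up to and including it.
    positions = accumulate(inc for _, inc in tokens)
    return list(zip(terms, positions))

# Increments inferred from the positions quoted in this report (assumption).
documented = [("XL", 1), ("4000", 1), ("ES", 1), ("XL4000ES", 0)]
observed = [("XL4000ES", 1), ("XL", 0), ("4000", 1), ("ES", 1)]

for label, stream in (("documented", documented), ("observed", observed)):
    print(label, absolute_positions(stream))
```

Under these assumed increments, the two streams differ only in where the zero increment falls: after "ES" in the documented output, and after "XL4000ES" in the observed one. That shift is what moves the concatenated token from position 3 to position 1.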

People

Assignee: Unassigned

Reporter: Chad Siongco

Votes: 0

Watchers: 3

Dates

Created: 29/Nov/17 08:28

Updated: 08/Jun/19 15:15