  1. Solr
  2. SOLR-11700

WordDelimiterGraphFilterFactory token positions

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 7.1
    • Fix Version/s: None
    • Component/s: Schema and Analysis
    • Labels: None
    • Environment: Mac OSX, JDK 8

    Description

      The token positions generated by WordDelimiterGraphFilterFactory are incorrect.

      This causes problems when doing phrase searches.

      The filter documentation at the following link gives this example:
      https://lucene.apache.org/solr/guide/6_6/filter-descriptions.html

      <analyzer type="query">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.WordDelimiterGraphFilterFactory" catenateAll="1"/>
      </analyzer>

      In: "XL-4000/ES"
      Tokenizer to Filter: "XL-4000/ES"(1)
      Out: "XL"(1), "4000"(2), "ES"(3), "XL4000ES"(3)

      On my machine, however, the concatenated token is at position 1 when it should be at position 3:
      Out: "XL4000ES"(1), "XL"(1), "4000"(2), "ES"(3)
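
To make the difference between the two outputs concrete, here is a minimal Java sketch that converts per-token position increments into the absolute positions shown above. It does not call Lucene or Solr. The class and method names are hypothetical, the increments are inferred from the positions quoted in this report, and positions are assumed to be 1-based as in the outputs above.

```java
import java.util.Arrays;

public class TokenPositions {

    /**
     * Turns per-token position increments into absolute positions.
     * An increment of 0 means the token shares the previous token's position.
     * Positions are 1-based here to match the outputs quoted in this report.
     */
    static int[] absolutePositions(int[] increments) {
        int[] positions = new int[increments.length];
        int position = 0;
        for (int i = 0; i < increments.length; i++) {
            position += increments[i];
            positions[i] = position;
        }
        return positions;
    }

    public static void main(String[] args) {
        // Documented order: "XL", "4000", "ES", "XL4000ES"
        // (increments inferred from the documented positions 1, 2, 3, 3)
        int[] documented = {1, 1, 1, 0};

        // Observed order: "XL4000ES", "XL", "4000", "ES"
        // (increments inferred from the observed positions 1, 1, 2, 3)
        int[] observed = {1, 0, 1, 1};

        System.out.println(Arrays.toString(absolutePositions(documented))); // [1, 2, 3, 3]
        System.out.println(Arrays.toString(absolutePositions(observed)));   // [1, 1, 2, 3]
    }
}
```

In the documented output the concatenated token shares position 3 with "ES", whereas in the observed output it shares position 1 with "XL".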

          People

            Assignee: Unassigned
            Reporter: Chad Siongco (chadsiongco)
            Votes: 0
            Watchers: 3