  1. Solr
  2. SOLR-5152

EdgeNGramFilterFactory deletes token

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 4.4
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

    Description

      I am using EdgeNGramFilterFactory in my schema.xml

      <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
        <analyzer type="index">
          <!-- ... -->
          <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="10" side="front" />
        </analyzer>
      </fieldType>

      Some tokens in my index consist of only one character, say R. minGramSize is set to 2, which is larger than the length of such a token. I expected the filter to leave R unchanged, but it actually deletes the token.

      For my use case this behavior is undesirable, and probably for most other use cases as well.
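
The reported behavior can be modeled with a small sketch. This is not Lucene's or Solr's code, only a plain-Python approximation of front-edge n-gram generation with the settings from the schema above (minGramSize="2", maxGramSize="10"). The function name `edge_ngrams` and the `keep_short` flag are hypothetical; `keep_short` exists only to contrast the reported behavior with the behavior the reporter expected.

```python
def edge_ngrams(token, min_gram=2, max_gram=10, keep_short=False):
    """Return the front-edge n-grams of `token` (illustrative model only)."""
    if len(token) < min_gram:
        # Reported behavior: no gram can be built, so the token vanishes.
        # keep_short=True models the expected behavior (token kept as-is).
        return [token] if keep_short else []
    # Grams run from min_gram up to max_gram, capped at the token length.
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


print(edge_ngrams("Solr"))                # ['So', 'Sol', 'Solr']
print(edge_ngrams("R"))                   # [] -> the token is dropped
print(edge_ngrams("R", keep_short=True))  # ['R'] -> the expected result
```

With this model, a single-character token such as R never reaches the index under the default path, which matches the problem described in this issue.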

      Attachments

        1. SOLR-5152.patch
          22 kB
          Furkan Kamaci
        2. SOLR-5152-v5.0.0.patch
          19 kB
          Sergey Urushkin

            People

              Assignee: Unassigned
              Reporter: Christoph Lingg
              Votes: 4
              Watchers: 8