  1. Solr
  2. SOLR-10010

NGramTokenizer with SynonymFilterFactory doesn't work properly when using Managed-Schema


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Schema and Analysis
    • Labels: None

    Description

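      NGramTokenizer with SynonymFilterFactory doesn't work properly when using Managed-Schema

      With a managed schema, the settings below do not produce the expected synonym expansion: a query for "ab" should also match its synonym "ba", but the parsed query contains only the original term.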

      managed-schema
      <field name="bigram" type="text_bigram" indexed="true" stored="true"/>
      <fieldType name="text_bigram" class="solr.TextField" positionIncrementGap="100"
                 autoGeneratePhraseQueries="false">
        <analyzer type="index">
          <tokenizer class="solr.NGramTokenizerFactory" minGramSize="2" maxGramSize="2"/>
          <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
                  tokenizerFactory="solr.NGramTokenizerFactory"
                  tokenizerFactory.minGramSize="2" tokenizerFactory.maxGramSize="2"
                  ignoreCase="true" expand="true"/>
        </analyzer>
      </fieldType>
      
      synonyms.txt
      ab,ba
      
      expected
      querystring => "bigram:ab"
      parsedquery => "bigram:ab bigram:ba"
      
      actual
      querystring => "bigram:ab"
      parsedquery => "bigram:ab"
      

      When using ClassicIndexSchemaFactory, it works properly.
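
      To make the expected result concrete, here is a minimal Python sketch of the analysis chain described above. It is a toy model, not Solr or Lucene code: the function names `bigrams`, `load_synonyms`, and `expand_query` are hypothetical, and it assumes single-token synonym entries with `ignoreCase="true"` and `expand="true"` semantics.

```python
def bigrams(text, min_gram=2, max_gram=2):
    """Character n-grams; a rough stand-in for NGramTokenizerFactory."""
    tokens = []
    for start in range(len(text)):
        for size in range(min_gram, max_gram + 1):
            if start + size <= len(text):
                tokens.append(text[start:start + size])
    return tokens


def load_synonyms(lines):
    """Parse comma-separated equivalence groups such as 'ab,ba'.

    Every term in a group maps to the whole group (expand=true),
    and terms are lower-cased (ignoreCase=true).
    """
    table = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        group = [term.strip().lower() for term in line.split(",")]
        for term in group:
            entry = table.setdefault(term, [])
            for other in group:
                if other not in entry:
                    entry.append(other)
    return table


def expand_query(field, text, synonyms):
    """Build a parsed-query-like string with one clause per expanded term."""
    clauses = []
    for token in bigrams(text.lower()):
        # A token without a synonym entry passes through unchanged.
        for term in synonyms.get(token, [token]):
            clause = f"{field}:{term}"
            if clause not in clauses:
                clauses.append(clause)
    return " ".join(clauses)


synonyms = load_synonyms(["ab,ba"])
print(expand_query("bigram", "ab", synonyms))  # expected behaviour
print(expand_query("bigram", "ab", {}))        # behaviour when synonyms are not applied
```

      In this model, the first call corresponds to the "expected" parsed query and the second, with an empty synonym table, to the "actual" one reported under Managed-Schema.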

          People

            Assignee: Unassigned
            Reporter: Issei Nishigata (issei2029)