Solr / SOLR-11954

Search behavior depends on kind of synonym mappings


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 7.2.1
    • None
    • None

    Description

      For a field with the following type:

      <fieldtype name="fulltext_en" class="solr.TextField" autoGeneratePhraseQueries="true">
         <analyzer type="index">
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <filter class="solr.WordDelimiterGraphFilterFactory"
                    generateWordParts="1" generateNumberParts="1" splitOnNumerics="1"
                    catenateWords="1" catenateNumbers="1" catenateAll="0"
                    preserveOriginal="1" protected="protwords_en.txt"/>
            <filter class="solr.FlattenGraphFilterFactory"/>
         </analyzer>
         <analyzer type="query">
            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
            <filter class="solr.WordDelimiterGraphFilterFactory"
                    generateWordParts="1" generateNumberParts="1" splitOnNumerics="1"
                    catenateWords="0" catenateNumbers="0" catenateAll="0"
                    preserveOriginal="1" protected="protwords_en.txt"/>
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.SynonymFilterFactory"
                    synonyms="synonyms_en.txt" ignoreCase="true" expand="true"/>
         </analyzer>
      </fieldtype>

      If the synonyms are configured as explicit mappings:

      b=>b,boron
      2=>ii,2

      Then for the query "my_field:b2" the parsedQuery is "my_field:b2 Synonym(my_field:2 my_field:ii)". Only the "2" part is expanded; the "b" part gets no synonyms.

      But when the synonyms are configured as equivalent terms:

      b,boron
      ii,2

      Then for the query "my_field:b2" the parsedQuery is "my_field:b2 my_field:\"b 2\" my_field:\"b ii\" my_field:\"boron 2\" my_field:\"boron ii\")"

      The second query is correct: it applies synonyms to both parts produced by the word split.

      Search behavior should not depend on which kind of synonym mapping is used.
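
To illustrate why both rule sets are expected to behave the same for this query, here is a minimal Python sketch of how the two synonym file styles expand single terms. It is not Solr's parser: `parse_synonyms` is a hypothetical helper that handles single-word terms only and assumes the documented semantics that `a=>x,y` maps `a` to `x` and `y`, and that `a,b` with expand="true" maps each term to all terms in the line.

```python
def parse_synonyms(lines, expand=True):
    """Parse Solr-style synonym rules into {source_term: [replacement_terms]}.

    Simplified illustration: single-word terms only, no escaping support.
    """
    mapping = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if "=>" in line:
            # Explicit mapping: every left-hand term maps to all right-hand terms.
            lhs, rhs = line.split("=>", 1)
            sources = [t.strip() for t in lhs.split(",")]
            targets = [t.strip() for t in rhs.split(",")]
        else:
            # Equivalent terms: with expand=True each term maps to all terms,
            # otherwise every term is reduced to the first one.
            terms = [t.strip() for t in line.split(",")]
            sources = terms
            targets = terms if expand else terms[:1]
        for source in sources:
            bucket = mapping.setdefault(source, [])
            for target in targets:
                if target not in bucket:
                    bucket.append(target)
    return mapping


explicit = parse_synonyms(["b=>b,boron", "2=>ii,2"])
equivalent = parse_synonyms(["b,boron", "ii,2"])

# The word split of "b2" yields the parts "b" and "2"; compare their expansions.
same_for_b = set(explicit["b"]) == set(equivalent["b"])
same_for_2 = set(explicit["2"]) == set(equivalent["2"])
```

Under these assumptions both `same_for_b` and `same_for_2` are true, so the parts "b" and "2" have identical expansions in either style. The only difference is that the equivalent-terms style also maps "boron" and "ii" back to their groups, which this query never uses.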

      This issue has also been discussed on the Solr user mailing list:
      http://lucene.472066.n3.nabble.com/SynonymGraphFilterFactory-with-WordDelimiterGraphFilterFactory-usage-td4373974.html

      I reproduced it on Solr 7.1.0; it can also be reproduced on 7.2.1.
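
One way to check the parsed query for each synonym file is to send the query with debugQuery=true and read the "parsedquery" entry from the debug section of the response. The sketch below builds such a request in Python; the host, port, and core name ("mycore") are assumptions for illustration, and `build_debug_url` and `fetch_parsed_query` are hypothetical helper names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_debug_url(base_url, core, query):
    """Build a /select URL that asks Solr to include debug output."""
    params = {"q": query, "debugQuery": "true", "wt": "json"}
    return f"{base_url}/{core}/select?{urlencode(params)}"


def fetch_parsed_query(url):
    """Fetch the response and return debug.parsedquery (needs a running Solr)."""
    with urlopen(url) as response:
        return json.load(response)["debug"]["parsedquery"]


# Assumed local instance and core name; adjust to the actual setup.
url = build_debug_url("http://localhost:8983/solr", "mycore", "my_field:b2")
```

Calling `fetch_parsed_query(url)` once with each version of synonyms_en.txt loaded (reloading the core in between) should show the two different parsed queries described above.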

          People

            Assignee: Unassigned
            Reporter: ashestak Alexandr
