Lucene - Core / LUCENE-8767

DisjunctionMaxQuery does not work well with multiple search terms, mm, and query fields of different fieldTypes.

    Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 7.3
    • Fix Version/s: None
    • Component/s: core/queryparser
    • Labels:
    • Environment:
    • Lucene Fields:
      New

      Description

      When the fields in qf (query fields) come from different fieldTypes, especially when one uses KeywordTokenizerFactory and another uses WhitespaceTokenizerFactory, the generated parsed query does not honor synonyms and mm, so the search matches the wrong set of documents. The details are as follows:

      1. We use Solr 7.3.1.
      2. Our qf=name^10 partNumber_ntk. The fieldType of name uses solr.WhitespaceTokenizerFactory and solr.WordDelimiterFilterFactory, while partNumber_ntk is not tokenized and uses solr.KeywordTokenizerFactory.
      3. mm=2<3 4<5 6<-80%25
      4. The search term is versatil sundress, and 'versatile' and 'testing' are synonyms. We have documents named "Versatil Empire Waist Sundress" that should be hit, but they are not.
      5. The same query works fine on Solr 5.5.4; it does not work on Solr 7.3.1.
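
      For reference, here is a rough sketch of how the mm value in item 3 would be interpreted once the URL-encoded %25 is decoded to %. It follows the documented semantics of conditional mm expressions as I understand them and is not Solr's actual code. The function name min_should_match is hypothetical, and the sketch assumes the spec has no whitespace around "<".

```python
def min_should_match(clause_count: int, spec: str) -> int:
    """Return how many of `clause_count` optional clauses must match.

    Illustrative only: a simplified reading of the mm syntax, not Solr's code.
    """
    spec = spec.strip()
    result = clause_count

    if "<" in spec:
        # Conditional form "N<value": `value` applies only when there are
        # more than N optional clauses. Conditions are checked left to right.
        for part in spec.split():
            bound, value = part.split("<", 1)
            if clause_count <= int(bound):
                return result
            result = min_should_match(clause_count, value)
        return result

    if spec.endswith("%"):
        # Percentage of the clause count, truncated toward zero.
        calc = int(clause_count * int(spec[:-1]) / 100)
    else:
        calc = int(spec)

    # A negative value means "all clauses except this many".
    result = clause_count + calc if calc < 0 else calc
    return max(0, min(clause_count, result))


if __name__ == "__main__":
    mm = "2<3 4<5 6<-80%"  # decoded form of the mm in this report
    for n in range(1, 11):
        print(n, "clauses ->", min_should_match(n, mm), "required")
```

      Under this reading, a query with two top-level optional clauses requires both of them, which is consistent with the ~2 that appears in the parsed queries in this report.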

      q=(Versatil%20testing)%20sundress&fl=name&defType=edismax&mm=2<3 4<5 6<-80%25&qf=name^10%20partNumber_ntk&debugQuery=true&wt=xml&rows=100
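
      Because the request is shown URL-encoded, the individual parameters are easier to read once decoded. The sketch below does this with Python's standard urllib.parse. The helper name decode_params is hypothetical, and the query string is copied from this report with the q= prefix attached.

```python
from urllib.parse import parse_qs

# Request parameters as given in this report (URL-encoded).
RAW_QUERY = (
    "q=(Versatil%20testing)%20sundress&fl=name&defType=edismax"
    "&mm=2<3 4<5 6<-80%25&qf=name^10%20partNumber_ntk"
    "&debugQuery=true&wt=xml&rows=100"
)


def decode_params(query_string: str) -> dict:
    """Decode a query string into {name: first value}."""
    return {name: values[0] for name, values in parse_qs(query_string).items()}


if __name__ == "__main__":
    for name, value in decode_params(RAW_QUERY).items():
        print(f"{name} = {value}")
```

      Decoded, the user query is (Versatil testing) sundress, qf is name^10 partNumber_ntk, and mm is 2<3 4<5 6<-80%.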

      parsedQuery:

      +(DisjunctionMaxQuery((((name:versatil name:test)~2)^10.0 | partNumber_ntk:versatil testing)) DisjunctionMaxQuery(((name:sundress)^10.0 | partNumber_ntk:sundress)))~2

      It seems that name is parsed incorrectly, to: name:versatil name:test

      If I change the query fields so that they share the same fieldType (for example, shortDescription has the same fieldType as name):

      q=(Versatil%20testing)%20sundress&fl=name&defType=edismax&mm=2<3 4<5 6<-80%25&qf=name^10%20shortDescription&debugQuery=true&wt=xml&rows=100

      ParsedQuery:

      +((DisjunctionMaxQuery(((name:versatil)^10.0 | shortDescription:versatil)) DisjunctionMaxQuery(((name:test)^10.0 | shortDescription:test))) DisjunctionMaxQuery(((name:sundress)^10.0 | shortDescription:sundress)))~2

      which hits the correct documents.

      Could someone check this or suggest a quick workaround? It currently has a big impact on our customer.

      Thanks in advance! The following is backup information:

       

       

       

        Attachments

        1. a.diff
          3 kB
          Chongchen Chen

          People

            • Assignee: Unassigned
            • Reporter: ZhongHua Wu
            • Votes: 0
            • Watchers: 2
