Lucene - Core / LUCENE-8347

BlendedInfixSuggester to handle multi term matches better


    • Type: Improvement
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 7.3.1
    • Fix Version/s: None
    • Component/s: core/search
    • Labels:
    • Lucene Fields:
      New, Patch Available


      Currently the BlendedInfixSuggester considers only the first match position when scoring a suggestion.
      From the lucene-dev mailing list:
      If I write more than one term in the query, for example
      "Mini Bar Fridge"
      I would expect the results in the following order (note that allTermsRequired=true and the schema weight field always returns 1000):

      • Mini Bar Fridge something
      • Mini Bar Fridge something else
      • Mini Bar something Fridge
      • Mini Bar something else Fridge
      • Mini something Bar Fridge

      Instead I see this:

      • Mini Bar something Fridge
      • Mini Bar something else Fridge
      • Mini Bar Fridge something
      • Mini Bar Fridge something else
      • Mini something Bar Fridge

      After looking at the suggester code (BlendedInfixSuggester.createCoefficient), I see that the component takes into account only one position: the lowest position (among the three matching terms) within the term vector ("mini" in the example above). As a result, all the suggestions above receive the same weight.
      The scope of this Jira issue is to improve the BlendedInfixSuggester so that it handles these scenarios better.
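
      The behaviour described above can be illustrated with a small standalone sketch. It does not use Lucene and is not the suggester's actual code: the class name FirstMatchOnlySketch, its helper methods, whitespace tokenization, prefix matching and the linear coefficient 1 - 0.10 * position are all assumptions made for illustration. It only shows that when the score depends on the lowest matching position alone, every suggestion starting with "Mini" gets the same score.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not the Lucene implementation.
public class FirstMatchOnlySketch {
    // Assumed linear decay per token position (illustrative value).
    static final double LINEAR_COEF = 0.10;

    // Lowest token position at which any query term matches as a prefix, or -1 if none does.
    static int firstMatchPosition(String suggestion, List<String> queryTerms) {
        String[] tokens = suggestion.toLowerCase(Locale.ROOT).trim().split("\\s+");
        for (int pos = 0; pos < tokens.length; pos++) {
            for (String term : queryTerms) {
                if (tokens[pos].startsWith(term.toLowerCase(Locale.ROOT))) {
                    return pos;
                }
            }
        }
        return -1;
    }

    // Coefficient derived from a single position; later matches are ignored.
    static double coefficient(int position) {
        return 1.0 - LINEAR_COEF * position;
    }

    // Final score: stored weight scaled by the first-match coefficient.
    static double score(String suggestion, List<String> queryTerms, long weight) {
        int pos = firstMatchPosition(suggestion, queryTerms);
        return pos < 0 ? 0.0 : weight * coefficient(pos);
    }

    public static void main(String[] args) {
        List<String> query = Arrays.asList("mini", "bar", "fridge");
        String[] suggestions = {
            "Mini Bar Fridge something",
            "Mini Bar Fridge something else",
            "Mini Bar something Fridge",
            "Mini Bar something else Fridge",
            "Mini something Bar Fridge"
        };
        // Every suggestion matches "mini" at position 0, so all scores are equal.
        for (String s : suggestions) {
            System.out.println(score(s, query, 1000L) + "  " + s);
        }
    }
}
```

      Because the five scores tie, the order of the results is left to whatever tie-breaking applies, which is consistent with the unexpected ordering reported above.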


        1. LUCENE-8347.patch
          23 kB
          Alessandro Benedetti
        2. LUCENE-8347.patch
          23 kB
          Alessandro Benedetti

              • Assignee:
                Alessandro Benedetti
              • Votes:
                2
              • Watchers:
                6


                • Created: