  1. Lucene - Core
  2. LUCENE-524

Fuzzy and wildcard queries are inappropriately implemented as Boolean query rewrites

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.9
    • Fix Version/s: None
    • Component/s: core/search
    • Labels: None

    Description

      The implementation of MultiTermQuery in terms of BooleanQuery introduces several problems:

      1) Collisions with the maximum clause limit on Boolean queries, which cause an exception to be thrown. This is the most problematic issue because it is difficult to determine in advance how many terms a fuzzy or wildcard query might involve.

      2) Boolean disjunctive scoring is not appropriate for either fuzzy or wildcard queries. In effect, the score is divided by the number of terms in the query, which has nothing to do with the relevance of the results.

      3) Performance of disjunctive Boolean queries over large term sets is quite sub-optimal.
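
      The first two problems can be illustrated with a small toy model. This is only a sketch under stated assumptions: it does not use Lucene, the class RewriteSketch, its methods, and the tiny limit of 4 are hypothetical stand-ins, and the coordination factor shown is a simplification of how disjunctive scoring scales with the number of expanded clauses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: none of these names come from Lucene or from the attached files.
public class RewriteSketch {

    // Hypothetical stand-in for the Boolean query clause limit (kept tiny for the demo).
    static final int MAX_CLAUSES = 4;

    // Expands a prefix pattern into one clause per matching term, as a Boolean rewrite would.
    // Throws once the expansion exceeds the clause limit.
    static List<String> rewritePrefix(String prefix, List<String> termDictionary) {
        List<String> clauses = new ArrayList<>();
        for (String term : termDictionary) {
            if (term.startsWith(prefix)) {
                if (clauses.size() >= MAX_CLAUSES) {
                    throw new IllegalStateException("too many clauses: limit is " + MAX_CLAUSES);
                }
                clauses.add(term);
            }
        }
        return clauses;
    }

    // Simplified coordination factor: fraction of the expanded clauses a document matches.
    static double coord(int matchedClauses, int totalClauses) {
        return (double) matchedClauses / totalClauses;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("lucene", "lucid", "luck", "lunar", "lunch", "solr");

        // "luc*" expands to three clauses, which fits under the limit.
        List<String> clauses = rewritePrefix("luc", terms);
        System.out.println(clauses);

        // "lu*" expands to five clauses; the caller cannot know this without scanning the terms.
        try {
            rewritePrefix("lu", terms);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }

        // A document containing only "lucene" matches one of the three expanded clauses,
        // so its score is scaled down by the size of the expansion.
        System.out.println(coord(1, clauses.size()));
    }
}
```

      In this model, whether the query fails depends entirely on how many indexed terms happen to match the pattern, and the scaling factor shrinks as the expansion grows, regardless of how well the document matches the user's original pattern.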

      Attachments

        1. MultiTermQuery.java
          6 kB
          Randy Puttick
        2. MultiTermScorer.java
          2 kB
          Randy Puttick

            People

              Assignee: Unassigned
              Reporter: Randy Puttick (randy@zillow.com)