  1. Lucene - Core
  2. LUCENE-8983

PhraseWildcardQuery - new query to control and optimize wildcard expansions in phrase

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: core/search
    • Labels:
      None
    • Lucene Fields:
      New, Patch Available

      Description

      A generalized version of PhraseQuery, built with one or more MultiTermQuery instances that provide term expansions for multi-terms (at least one of the expanded terms must match).

      Its main advantage is that it controls the total number of expansions across all MultiTermQuery instances and across all segments.

      This query is similar to MultiPhraseQuery, but it handles, controls and optimizes the multi-term expansions.

      This query is equivalent to building an ordered SpanNearQuery with a list of SpanTermQuery and SpanMultiTermQueryWrapper.
      But it optimizes the multi-term expansions and the segment accesses.
      It first resolves the single terms so that it can stop early if any of them does not match. Then it expands each multi-term sequentially, stopping immediately if one has no match. It detects the segments that do not match and skips them for the next expansions. This often avoids expanding the other multi-terms on some or even all segments. Finally, it controls the total number of expansions.
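
      The resolution order described above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not the code from the patch: a segment is reduced to its sorted term dictionary, a multi-term is written as a prefix ending in "*", and every (segment, term) pair counts as one expansion against the budget. The names PhraseExpansionSketch, Result, segment and resolve are hypothetical and do not exist in Lucene.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Toy model of the expansion strategy; no Lucene classes are used.
class PhraseExpansionSketch {

    /** Outcome: indexes of segments that may still match, and expansions performed. */
    static final class Result {
        final List<Integer> liveSegments;
        final int expansions;

        Result(List<Integer> liveSegments, int expansions) {
            this.liveSegments = liveSegments;
            this.expansions = expansions;
        }

        boolean canMatch() {
            return !liveSegments.isEmpty();
        }
    }

    /** Builds a toy segment that holds only a sorted term dictionary. */
    static NavigableSet<String> segment(String... terms) {
        return new TreeSet<>(Arrays.asList(terms));
    }

    /**
     * Phrase elements ending in '*' are prefix multi-terms (an assumption of
     * this sketch); all other elements are single terms.
     */
    static Result resolve(List<NavigableSet<String>> segments,
                          List<String> phrase,
                          int maxExpansions) {
        List<Integer> live = new ArrayList<>();
        for (int i = 0; i < segments.size(); i++) {
            live.add(i);
        }

        // Step 1: resolve single terms first; they are cheap and may end the work early.
        for (String element : phrase) {
            if (element.endsWith("*")) {
                continue;
            }
            live.removeIf(i -> !segments.get(i).contains(element));
            if (live.isEmpty()) {
                return new Result(live, 0); // no segment has this term: nothing to expand
            }
        }

        // Step 2: expand multi-terms one after another, only on segments still alive.
        int expansions = 0;
        for (String element : phrase) {
            if (!element.endsWith("*")) {
                continue;
            }
            String prefix = element.substring(0, element.length() - 1);
            List<Integer> stillLive = new ArrayList<>();
            for (int i : live) {
                boolean matched = false;
                for (String term : segments.get(i).tailSet(prefix, true)) {
                    if (!term.startsWith(prefix)) {
                        break; // left the prefix range of the sorted dictionary
                    }
                    matched = true;
                    if (expansions == maxExpansions) {
                        break; // global budget exhausted: stop enumerating terms
                    }
                    expansions++;
                }
                if (matched) {
                    stillLive.add(i); // segments without a match are skipped from now on
                }
            }
            live = stillLive;
            if (live.isEmpty()) {
                return new Result(live, expansions); // this multi-term matches nowhere
            }
        }
        return new Result(live, expansions);
    }
}
```

      In this model, a segment that lacks a single term is never visited for a multi-term, so its dictionary is not enumerated at all. How the actual patch counts expansions and distributes the budget among multi-terms may differ from this simplification.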

              People

              • Assignee:
                Unassigned
                Reporter:
                bruno.roustant Bruno Roustant
              • Votes:
                1
                Watchers:
                5

                Dates

                • Created:
                  Updated:

                  Time Tracking

                  Estimated:
                  Not Specified
                  Remaining:
                  0h
                  Logged:
                  2h 40m