  1. Lucene - Core
  2. LUCENE-5415

Support wildcard & co in PostingsHighlighter

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.7, 6.0
    • Component/s: modules/highlighter
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      PostingsHighlighter uses the offsets encoded in the postings lists for the terms to find query matches.

      As such, it isn't really suitable for multi-term queries like wildcards, for two reasons:
      1. they need an expensive rewrite against the term dictionary (I think other highlighters share this problem)
      2. they accumulate data from potentially many terms (e.g. reading many postings lists)

      However, we could provide an option that lets some of these queries work in a different way that avoids both downsides.

      Instead of rewriting the query, we can just grab its Automaton representation and match that against the content directly (which won't blow up, no matter how many terms the pattern would expand to).
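
      To make the idea concrete, here is a minimal sketch of matching a pattern against the content itself rather than expanding it against the term dictionary. It is not the code from the attached patches: it uses java.util.regex as a stand-in for Lucene's Automaton classes, splits tokens on whitespace instead of running an Analyzer, and the names WildcardOffsetSketch, compileWildcard and matchOffsets are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of Lucene or of the LUCENE-5415 patches.
final class WildcardOffsetSketch {

    // Translate a wildcard pattern into a compiled matcher:
    // '*' matches any run of characters, '?' matches exactly one character.
    // (Assumption: java.util.regex stands in for an Automaton built from the query.)
    static Pattern compileWildcard(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < wildcard.length(); i++) {
            char c = wildcard.charAt(i);
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    // Scan the content once, test each token against the compiled pattern,
    // and record {startOffset, endOffset} for every token that matches.
    // (Assumption: whitespace tokenization replaces a real analysis chain.)
    static List<int[]> matchOffsets(String wildcard, String content) {
        Pattern termPattern = compileWildcard(wildcard);
        Matcher tokens = Pattern.compile("\\S+").matcher(content);
        List<int[]> offsets = new ArrayList<>();
        while (tokens.find()) {
            if (termPattern.matcher(tokens.group()).matches()) {
                offsets.add(new int[] {tokens.start(), tokens.end()});
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Print the character offsets of every token matching te*t.
        for (int[] o : matchOffsets("te*t", "a test of text tent tea")) {
            System.out.println(o[0] + "-" + o[1]);
        }
    }
}
```

      In this sketch the work grows with the length of the content being highlighted, not with the number of index terms the wildcard could expand to, which is the property the description is after.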

        Attachments

        1. LUCENE-5415.patch
          59 kB
          Robert Muir
        2. LUCENE-5415.patch
          42 kB
          Michael McCandless
        3. LUCENE-5415.patch
          38 kB
          Robert Muir
        4. LUCENE-5415.patch
          23 kB
          Robert Muir
        5. LUCENE-5415.patch
          13 kB
          Robert Muir

            People

            • Assignee:
              Unassigned
            • Reporter:
              Robert Muir (rcmuir)