Lucene - Core / LUCENE-5415

Support wildcard & co in PostingsHighlighter

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.7, 6.0
    • Component/s: modules/highlighter
    • Labels: None
    • Lucene Fields: New

    Description

      PostingsHighlighter uses the offsets encoded in the postings lists for the terms to find query matches.

      As such, it isn't really suitable for multi-term queries such as wildcards, for two reasons:
      1. they require an expensive rewrite against the term dictionary (I think other highlighters share this problem)
      2. they accumulate data from potentially many terms (e.g. reading many postings)

      However, we could provide an option that lets some of these queries work in a different way that avoids these downsides.

      Instead, we can just grab the Automaton representation of the query and match it directly against the content, which won't blow up however many terms the query would otherwise expand to.
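
      To make the idea concrete, here is a rough, stdlib-only sketch of matching a wildcard pattern directly against document text and collecting character offsets. It is not the code from the attached patches and it does not use Lucene's Automaton classes: WildcardContentMatcher, matches and findMatches are hypothetical names, the tokenization (runs of letters and digits, lower-cased) is an assumption, and the matcher is a small hand-rolled NFA simulation that only understands '*' and '?'.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: matches a wildcard pattern against raw text, with no index involved. */
final class WildcardContentMatcher {

    /** True if the whole term matches the pattern ('*' = any run of chars, '?' = exactly one char). */
    static boolean matches(String pattern, String term) {
        int n = pattern.length();
        // active[i] == true means the NFA may be in the state "i pattern chars consumed"
        boolean[] active = new boolean[n + 1];
        active[0] = true;
        closeOverStars(pattern, active);
        for (int k = 0; k < term.length(); k++) {
            char c = term.charAt(k);
            boolean[] next = new boolean[n + 1];
            for (int i = 0; i < n; i++) {
                if (!active[i]) continue;
                char p = pattern.charAt(i);
                if (p == '*') {
                    next[i] = true;       // '*' absorbs c and stays in place
                } else if (p == '?' || p == c) {
                    next[i + 1] = true;   // consume one pattern char
                }
            }
            closeOverStars(pattern, next);
            active = next;
        }
        return active[n];
    }

    /** Epsilon step: a '*' may match the empty string, so state i also reaches state i + 1. */
    private static void closeOverStars(String pattern, boolean[] states) {
        for (int i = 0; i < pattern.length(); i++) {
            if (states[i] && pattern.charAt(i) == '*') states[i + 1] = true;
        }
    }

    /** Returns {start, end} character offsets (end exclusive) of every matching token. */
    static List<int[]> findMatches(String pattern, String content) {
        List<int[]> hits = new ArrayList<>();
        int i = 0;
        while (i < content.length()) {
            if (!Character.isLetterOrDigit(content.charAt(i))) { i++; continue; }
            int start = i;
            while (i < content.length() && Character.isLetterOrDigit(content.charAt(i))) i++;
            // Assumption: terms are compared lower-cased, as a simple analyzer might do.
            String token = content.substring(start, i).toLowerCase(Locale.ROOT);
            if (matches(pattern, token)) hits.add(new int[] {start, i});
        }
        return hits;
    }
}
```

      For example, findMatches("luc*", "Lucene highlights lucid text") reports the offsets of "Lucene" and "lucid". The work is bounded by the content length times the pattern length and never consults the term dictionary or postings, which is the property the description is after. A real implementation would presumably use the automaton classes Lucene already ships in org.apache.lucene.util.automaton rather than a hand-rolled matcher.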

      Attachments

        1. LUCENE-5415.patch
          59 kB
          Robert Muir
        2. LUCENE-5415.patch
          42 kB
          Michael McCandless
        3. LUCENE-5415.patch
          38 kB
          Robert Muir
        4. LUCENE-5415.patch
          23 kB
          Robert Muir
        5. LUCENE-5415.patch
          13 kB
          Robert Muir

          People

            Assignee: Unassigned
            Reporter: Robert Muir (rcmuir)
            Votes: 0
            Watchers: 2
