Lucene - Core / LUCENE-7578

UnifiedHighlighter: Convert PhraseHelper to use SpanCollector API

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: modules/highlighter
    • Labels: None
    • Lucene Fields: New

      Description

      The PhraseHelper of the UnifiedHighlighter currently collects position-spans per SpanQuery (and it knows which terms are in which SpanQuery), and then it filters PostingsEnum based on that. This is similar to how WeightedSpanTermExtractor (WSTE) works in the original Highlighter. The main problem with this approach is that it can be inaccurate for some nested span queries: see LUCENE-2287, LUCENE-5455 (has the clearest example), and LUCENE-6796. Non-nested SpanQueries (e.g. those converted from a PhraseQuery or MultiPhraseQuery) are not a problem.
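
      To illustrate the kind of inaccuracy described above, here is a minimal, self-contained sketch that does not use any Lucene classes. It assumes a hypothetical nested query, near(near(a, b, slop=0), c, slop=2), run against the tokens "a b a x c", and hard-codes the match instead of computing it. The class name SpanFilterSketch and both method names are invented for this illustration; this is not the PhraseHelper code, only a simplified model of filtering by span range versus keeping only the positions that leaf terms report for a match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class SpanFilterSketch {

    // Simplified model of position-span filtering: keep every occurrence of a
    // query term whose position lies inside the matched range [start, end).
    public static List<Integer> filterBySpanRange(String[] tokens, Set<String> queryTerms,
                                                  int start, int end) {
        List<Integer> hits = new ArrayList<>();
        for (int pos = 0; pos < tokens.length; pos++) {
            if (queryTerms.contains(tokens[pos]) && pos >= start && pos < end) {
                hits.add(pos);
            }
        }
        return hits;
    }

    // Simplified model of per-leaf collection: keep only the positions that the
    // leaf terms of an actual match reported (assumed to be given by the caller).
    public static List<Integer> filterByCollectedLeaves(String[] tokens,
                                                        Set<Integer> leafPositions) {
        List<Integer> hits = new ArrayList<>();
        for (int pos = 0; pos < tokens.length; pos++) {
            if (leafPositions.contains(pos)) {
                hits.add(pos);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String[] tokens = {"a", "b", "a", "x", "c"};
        Set<String> queryTerms = Set.of("a", "b", "c");

        // Assumed match: inner "a b" at positions 0-1, "c" at position 4,
        // so the outer span covers [0, 5).
        System.out.println(filterBySpanRange(tokens, queryTerms, 0, 5));       // [0, 1, 2, 4]
        System.out.println(filterByCollectedLeaves(tokens, Set.of(0, 1, 4)));  // [0, 1, 4]
    }
}
```

      In this model the range-based filter also keeps the second "a" at position 2, which lies inside the outer span but takes no part in the match. Collecting positions per leaf avoids that extra hit.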

            People

            • Assignee: Unassigned
            • Reporter: David Smiley (dsmiley)
            • Votes: 1
            • Watchers: 2
