  1. Lucene - Core
  2. LUCENE-10680

UnifiedHighlighter's term extraction not working for some query rewrites


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Duplicate
    • Component: modules/highlighter
    • New

    Description

      UnifiedHighlighter rewrites the query against an empty index when extracting the terms from the query (see https://github.com/apache/lucene/blob/d5d6dc079395c47cd6d12dcce3bcfdd2c7d9dc63/lucene/highlighter/src/java/org/apache/lucene/search/uhighlight/UnifiedHighlighter.java#L149).

      The rewrite step can unfortunately drop the terms that are to be extracted.

      Take for example the boolean query "+field:value -ConstantScore(FieldExistsQuery [field=other_field])" when highlighting on "field".

      On an empty index, the `FieldExistsQuery` rewrites to a `MatchAllDocsQuery`. Because that clause is `MUST_NOT`, the enclosing boolean query then rewrites to a `MatchNoDocsQuery`, which discards the `MUST` clause, so the `field:value` term is never extracted.
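      The effect can be illustrated with a small self-contained model. This is only a sketch: the classes below (`TermQuery`, `FieldExists`, `BoolQuery`, and so on) are hypothetical stand-ins written for this example, not Lucene's classes, and `rewriteOnEmptyIndex` encodes just the two rewrite rules described above. The `ConstantScore` wrapper from the example query is left out for brevity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class RewriteDropsTerms {
    // Toy query model (hypothetical stand-ins, NOT Lucene's classes).
    interface Query {}
    static final class TermQuery implements Query {
        final String field, value;
        TermQuery(String field, String value) { this.field = field; this.value = value; }
    }
    static final class FieldExists implements Query {
        final String field;
        FieldExists(String field) { this.field = field; }
    }
    static final class MatchAll implements Query {}
    static final class MatchNone implements Query {}
    static final class BoolQuery implements Query {
        final List<Query> must, mustNot;
        BoolQuery(List<Query> must, List<Query> mustNot) { this.must = must; this.mustNot = mustNot; }
    }

    // Simplified rewrite, assuming only the two rules described in the issue:
    // 1. a field-exists query becomes match-all on an empty index;
    // 2. a boolean query with a match-all MUST_NOT clause becomes match-none.
    static Query rewriteOnEmptyIndex(Query q) {
        if (q instanceof FieldExists) {
            return new MatchAll();
        }
        if (q instanceof BoolQuery) {
            BoolQuery b = (BoolQuery) q;
            List<Query> must = new ArrayList<>();
            List<Query> mustNot = new ArrayList<>();
            for (Query c : b.must) must.add(rewriteOnEmptyIndex(c));
            for (Query c : b.mustNot) {
                Query r = rewriteOnEmptyIndex(c);
                if (r instanceof MatchAll) return new MatchNone(); // MUST clauses are lost here
                mustNot.add(r);
            }
            return new BoolQuery(must, mustNot);
        }
        return q;
    }

    // Collect the terms of MUST clauses that target the highlighted field.
    static Set<String> extractTerms(Query q, String field) {
        Set<String> out = new LinkedHashSet<>();
        collect(q, field, out);
        return out;
    }

    private static void collect(Query q, String field, Set<String> out) {
        if (q instanceof TermQuery) {
            TermQuery t = (TermQuery) q;
            if (t.field.equals(field)) out.add(t.value);
        } else if (q instanceof BoolQuery) {
            for (Query c : ((BoolQuery) q).must) collect(c, field, out);
        }
    }

    // Models "+field:value -FieldExists(other_field)".
    static Query sampleQuery() {
        return new BoolQuery(
            Collections.<Query>singletonList(new TermQuery("field", "value")),
            Collections.<Query>singletonList(new FieldExists("other_field")));
    }

    public static void main(String[] args) {
        Query q = sampleQuery();
        System.out.println(extractTerms(q, "field"));                      // prints [value]
        System.out.println(extractTerms(rewriteOnEmptyIndex(q), "field")); // prints []
    }
}
```

      In this model, extracting terms from the original query finds `value`, while extracting from the rewritten query finds nothing, which mirrors the behaviour reported here.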

          People

            Assignee: Unassigned
            Reporter: Yannick Welsch (ywelsch)