Lucene - Core / LUCENE-10454

UnifiedHighlighter can miss terms because of query rewrites

Details

    • Bug
    • Status: Open
    • Minor
    • Resolution: Unresolved

    Description

      Before extracting terms from a query, UnifiedHighlighter rewrites the query using an empty searcher. If the query rewrites to MatchNoDocsQuery when the reader is empty, then the highlighter will fail to extract terms. This is more of an issue now that we rewrite BooleanQuery to MatchNoDocsQuery when any of its required clauses is MatchNoDocsQuery (https://issues.apache.org/jira/browse/LUCENE-10412). I attached a patch showing the problem.
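
      The mechanism can be sketched with a small, self-contained toy model. This is an illustration only: it is not the attached patch and it does not use Lucene's API. The types `Term`, `EmptyAwareTerm`, `AllOf`, and `MatchNoDocs` are hypothetical stand-ins for TermQuery, a custom query, BooleanQuery with required clauses, and MatchNoDocsQuery, and the boolean `readerIsEmpty` stands in for rewriting against an empty searcher.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy model only: these types imitate the shape of the problem,
// they are NOT Lucene classes and do not reproduce Lucene's real logic.
class RewriteSketch {

    interface Query {
        // readerIsEmpty stands in for rewriting against an empty searcher.
        Query rewrite(boolean readerIsEmpty);
        // Adds the terms a highlighter could use to the given set.
        void collectTerms(Set<String> out);
    }

    // Stand-in for MatchNoDocsQuery: matches nothing and exposes no terms.
    static final class MatchNoDocs implements Query {
        public Query rewrite(boolean readerIsEmpty) { return this; }
        public void collectTerms(Set<String> out) { }
    }

    // Stand-in for a plain term query.
    static final class Term implements Query {
        final String text;
        Term(String text) { this.text = text; }
        public Query rewrite(boolean readerIsEmpty) { return this; }
        public void collectTerms(Set<String> out) { out.add(text); }
    }

    // Hypothetical custom query that rewrites to MatchNoDocs on an empty reader.
    static final class EmptyAwareTerm implements Query {
        final String text;
        EmptyAwareTerm(String text) { this.text = text; }
        public Query rewrite(boolean readerIsEmpty) {
            return readerIsEmpty ? new MatchNoDocs() : new Term(text);
        }
        public void collectTerms(Set<String> out) { out.add(text); }
    }

    // Stand-in for a BooleanQuery whose clauses are all required.
    static final class AllOf implements Query {
        final List<Query> clauses;
        AllOf(Query... clauses) { this.clauses = List.of(clauses); }
        public Query rewrite(boolean readerIsEmpty) {
            List<Query> rewritten = new ArrayList<>();
            for (Query clause : clauses) {
                Query r = clause.rewrite(readerIsEmpty);
                // Mirrors the behavior described for LUCENE-10412: one required
                // clause that matches nothing collapses the whole query.
                if (r instanceof MatchNoDocs) {
                    return r;
                }
                rewritten.add(r);
            }
            return new AllOf(rewritten.toArray(new Query[0]));
        }
        public void collectTerms(Set<String> out) {
            for (Query clause : clauses) {
                clause.collectTerms(out);
            }
        }
    }

    // Rewrite first, then extract terms, as the description says the highlighter does.
    static Set<String> extractTerms(Query query, boolean readerIsEmpty) {
        Set<String> terms = new LinkedHashSet<>();
        query.rewrite(readerIsEmpty).collectTerms(terms);
        return terms;
    }

    public static void main(String[] args) {
        Query query = new AllOf(new Term("lucene"), new EmptyAwareTerm("highlight"));
        System.out.println(extractTerms(query, true));   // prints []
        System.out.println(extractTerms(query, false));  // prints [lucene, highlight]
    }
}
```

      In this model, rewriting against the empty reader loses both terms, including "lucene", even though only one clause depends on the reader's contents.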

      This feels like a pretty esoteric issue, but I figured it was worth raising for awareness. I think it only applies when weightMatches=false, which isn't the default. I couldn't find any existing queries in Lucene that would be affected.

      We ran into it while upgrading Elasticsearch to the latest Lucene snapshot, since a couple of custom queries there rewrite to MatchNoDocsQuery when the reader is empty.

      Attachments

        1. LUCENE-10454.patch
          3 kB
          Julie Tibshirani
        2. LUCENE-10454-fix.patch
          1 kB
          David Smiley

          People

            Assignee: Unassigned
            Reporter: Julie Tibshirani
            Votes: 0
            Watchers: 2
