  1. Lucene - Core
  2. LUCENE-5718

More flexible compound queries (containing mtq) support in postings highlighter

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 4.8.1
    • Fix Version/s: None
    • Component/s: modules/highlighter
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      The postings highlighter currently pulls the automata directly from multi-term queries, so highlighting works without calling rewrite. To do so, it must also check whether the query is a compound one and, if so, extract its subqueries. This is currently done in the MultiTermHighlighting class and works well, but it has two potential problems:

      1) Not every compound query is necessarily supported, because each one has to be handled explicitly (see LUCENE-5717), and the "switch" has to be kept up to date whenever new queries get added to Lucene.
      2) Custom compound queries are not supported, only the set of queries available out of the box.

      I've been thinking about how this can be improved, and one idea is to introduce a generic way to retrieve the subqueries from compound queries: for instance, a new abstract base class with a getLeaves or getSubQueries method that all compound queries would extend. This method would return a flat array of all the leaf queries that the compound query is made of.
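
The base-class idea could look roughly like the sketch below. This is only an illustration under stated assumptions: none of these types exist in Lucene, every name (SketchQuery, SketchLeafQuery, SketchCompoundQuery, SketchBooleanLikeQuery, children) is hypothetical, and it returns a List rather than an array for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a query; this is NOT Lucene's Query class.
abstract class SketchQuery {
    private final String name;
    SketchQuery(String name) { this.name = name; }
    String name() { return name; }
}

// A leaf query has no subqueries (think of a term or multi-term query).
final class SketchLeafQuery extends SketchQuery {
    SketchLeafQuery(String name) { super(name); }
}

// Hypothetical base class that every compound query would extend.
abstract class SketchCompoundQuery extends SketchQuery {
    SketchCompoundQuery(String name) { super(name); }

    // Direct children only; each compound type knows its own structure.
    abstract List<SketchQuery> children();

    // Flattens nested compound queries into a list of leaves, depth-first.
    final List<SketchQuery> getLeaves() {
        List<SketchQuery> leaves = new ArrayList<>();
        collect(this, leaves);
        return leaves;
    }

    private static void collect(SketchQuery q, List<SketchQuery> out) {
        if (q instanceof SketchCompoundQuery) {
            for (SketchQuery child : ((SketchCompoundQuery) q).children()) {
                collect(child, out);
            }
        } else {
            out.add(q);
        }
    }
}

// A custom compound query only has to describe its direct children;
// leaf extraction is inherited from the base class.
final class SketchBooleanLikeQuery extends SketchCompoundQuery {
    private final List<SketchQuery> clauses;

    SketchBooleanLikeQuery(String name, SketchQuery... clauses) {
        super(name);
        this.clauses = Arrays.asList(clauses);
    }

    @Override
    List<SketchQuery> children() { return clauses; }
}
```

With something along these lines, the highlighter would need a single instanceof check against the base class instead of one branch per compound query type, and custom compound queries would be covered as long as they extend it.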

      I'm not sure whether this would be needed elsewhere in Lucene, but it doesn't seem like a small change, and it would definitely affect (or benefit?) more than just the postings highlighter's support for multi-term queries.

      In particular, the second problem (custom queries) seems hard to solve without a way to expose this information directly from the query, unless we want to make the MultiTermHighlighting#extractAutomata method extensible in some way.
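
One possible shape for the "extensible in some way" alternative is a registry of per-type extractors that the highlighter consults while walking the query tree. The sketch below is an assumption for discussion only: SubQueryRegistry, register, leaves and MyCustomQuery are hypothetical names, queries are modelled as plain Objects, and nothing here reflects how MultiTermHighlighting#extractAutomata is actually implemented.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical registry: maps a query class to a function that returns
// the direct subqueries of an instance of that class.
final class SubQueryRegistry {
    private final Map<Class<?>, Function<Object, List<?>>> extractors = new LinkedHashMap<>();

    // Users register an extractor for each custom compound query type.
    <Q> void register(Class<Q> type, Function<Q, List<?>> extractor) {
        extractors.put(type, q -> extractor.apply(type.cast(q)));
    }

    // Returns all leaves reachable from the given query.
    // Types without a registered extractor are treated as leaves.
    List<Object> leaves(Object query) {
        List<Object> out = new ArrayList<>();
        collect(query, out);
        return out;
    }

    private void collect(Object query, List<Object> out) {
        // Exact class match only; subclasses are not looked up in this sketch.
        Function<Object, List<?>> extractor = extractors.get(query.getClass());
        if (extractor == null) {
            out.add(query);
            return;
        }
        for (Object child : extractor.apply(query)) {
            collect(child, out);
        }
    }
}

// Hypothetical custom compound query that built-in logic knows nothing about.
final class MyCustomQuery {
    final List<Object> parts;
    MyCustomQuery(List<Object> parts) { this.parts = parts; }
}
```

A caller would register the custom type once, for example `registry.register(MyCustomQuery.class, q -> q.parts)`, and the traversal would then descend into it. The trade-off compared with a shared base class is that the knowledge about a query's structure lives outside the query itself.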

      I would like to hear what people think, and to work on this once we have identified which direction we want to take.

        Attachments

        1. LUCENE-5718.patch
          62 kB
          Luca Cavanna

              People

              • Assignee:
                Unassigned
                Reporter:
                lucacavanna Luca Cavanna
              • Votes:
                0
                Watchers:
                2