  1. Lucene - Core
  2. LUCENE-9439

Matches API should enumerate hit fields that have no positions (no iterator)


Details

    • Improvement
    • Status: Resolved
    • Minor
    • Resolution: Duplicate
    • New

    Description

      I have been experimenting with the Matches API and it's great. There is one corner case that doesn't work for me, though: queries that hit fields without positions return MatchesUtils.MATCH_WITH_NO_TERMS, and this constant is problematic because it doesn't carry the name of the field that caused the match (it returns null).

      The associated fromSubMatches method combines all these constants into one (or swallows them), which is another problem.

      I think it would be more consistent if MATCH_WITH_NO_TERMS was replaced with a true match (carrying field name) returning an empty iterator (or a constant "empty" iterator NO_TERMS).
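
      To make the proposal concrete, here is a minimal sketch of the idea. It uses hypothetical, simplified types (FieldMatches, MatchSketch, and a NO_TERMS constant) rather than Lucene's actual Matches classes, and it assumes a match can be modelled as a map from field name to a list of position pairs. It shows a match on a position-less field that still reports its field name, and a combiner that keeps such fields instead of merging them into one anonymous constant.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for a per-document match result.
// This is NOT Lucene's Matches interface; all names are illustrative.
interface FieldMatches {
    /** Names of the fields that produced a hit, in insertion order. */
    Iterable<String> fields();

    /** Positions for a field: empty if it has none, null if the field did not match. */
    Iterator<int[]> positions(String field);
}

final class MatchSketch {
    /** Shared empty position list, standing in for the proposed NO_TERMS iterator. */
    static final List<int[]> NO_TERMS = Collections.emptyList();

    /** A match on a field without positions: the field name is kept, the positions are empty. */
    static FieldMatches forFieldWithoutPositions(String field) {
        return forField(field, NO_TERMS);
    }

    /** A match on a single field with the given (start, end) position pairs. */
    static FieldMatches forField(String field, List<int[]> positions) {
        Map<String, List<int[]>> byField = new LinkedHashMap<>();
        byField.put(field, positions);
        return fromMap(byField);
    }

    /** Combines sub-matches without dropping fields that have no positions. */
    static FieldMatches fromSubMatches(List<FieldMatches> subs) {
        Map<String, List<int[]>> merged = new LinkedHashMap<>();
        for (FieldMatches sub : subs) {
            for (String field : sub.fields()) {
                // Register the field even if it contributes no positions.
                List<int[]> target = merged.computeIfAbsent(field, f -> new ArrayList<>());
                Iterator<int[]> it = sub.positions(field);
                while (it.hasNext()) {
                    target.add(it.next());
                }
            }
        }
        return fromMap(merged);
    }

    private static FieldMatches fromMap(Map<String, List<int[]>> byField) {
        return new FieldMatches() {
            @Override
            public Iterable<String> fields() {
                return byField.keySet();
            }

            @Override
            public Iterator<int[]> positions(String field) {
                List<int[]> p = byField.get(field);
                return p == null ? null : p.iterator();
            }
        };
    }
}
```

      With a model like this, a highlighter could iterate over fields() to discover every field that contributed to the hit, and treat an empty positions iterator as "matched, but nothing to highlight by offset".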

      I have a very compelling use case: I wrote an "auto-highlighter" that runs on top of Matches API and automatically picks up query-relevant fields and snippets. Everything works beautifully except for cases where fields are searchable but don't have any positions (token-like fields).

      I can work on a patch but wanted to reach out first - romseygeek?

      Attachments

        1. matchhighlighter.patch
          80 kB
          Dawid Weiss
        2. LUCENE-9439.patch
          3 kB
          Alan Woodward

        Issue Links

          Activity

            People

              dweiss Dawid Weiss
              dweiss Dawid Weiss
              Votes: 0
              Watchers: 3

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 3h 10m