Lucene - Core / LUCENE-7627

Improve TermsEnum automaton filtering APIs

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 6.4
    • Component/s: None
    • Labels: None
    • Lucene Fields: New

      Description

      To filter a TermsEnum by a CompiledAutomaton, we currently have a number of different possibilities:

      • Terms.intersect(CompiledAutomaton, BytesRef) - efficient, but only works on NORMAL type automata
      • CompiledAutomaton.getTermsEnum(Terms) - efficient and works on all automaton types, but requires a Terms rather than a TermsEnum, so it is of no use for e.g. SortedDocValues.termsEnum()
      • AutomatonTermsEnum - takes a TermsEnum, so it's more general than the Terms methods above, but again only works on NORMAL automata

      It's easy to do the wrong thing here, and at the moment we only guard against incorrect usage via runtime checks (see e.g. LUCENE-7576, https://github.com/flaxsearch/marple/issues/24). We should try to clean this up.
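To make the pitfall concrete, here is a toy model in plain Java of what a filter has to do when all it is given is an iterator of terms. It is only a sketch under stated assumptions: Kind, Compiled and filter are hypothetical names invented for illustration, not Lucene classes and not the approach taken in the attached patch. The four kinds are meant to mirror the values of CompiledAutomaton.AUTOMATON_TYPE (NONE, ALL, SINGLE, NORMAL); a helper that handles only NORMAL must either reject the other three at runtime or silently return wrong results.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

/** Toy model only: none of these types are Lucene classes. */
class AutomatonTypeDispatch {

    /** Hypothetical mirror of the four compiled-automaton kinds. */
    enum Kind { NONE, ALL, SINGLE, NORMAL }

    /** Hypothetical stand-in for a compiled automaton. */
    static final class Compiled {
        final Kind kind;
        final String singleTerm;          // consulted only when kind == SINGLE
        final Predicate<String> matcher;  // consulted only when kind == NORMAL

        Compiled(Kind kind, String singleTerm, Predicate<String> matcher) {
            this.kind = kind;
            this.singleTerm = singleTerm;
            this.matcher = matcher;
        }
    }

    /** Filters a plain iterator of terms, handling every kind instead of only NORMAL. */
    static List<String> filter(Iterator<String> terms, Compiled compiled) {
        List<String> accepted = new ArrayList<>();
        while (terms.hasNext()) {
            String term = terms.next();
            boolean keep;
            switch (compiled.kind) {
                case NONE:   keep = false; break;                              // matches nothing
                case ALL:    keep = true; break;                               // matches everything
                case SINGLE: keep = term.equals(compiled.singleTerm); break;   // exact-term match
                case NORMAL: keep = compiled.matcher.test(term); break;        // general matcher
                default: throw new IllegalStateException("unknown kind: " + compiled.kind);
            }
            if (keep) {
                accepted.add(term);
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("apple", "banana", "cherry");
        System.out.println(filter(terms.iterator(), new Compiled(Kind.SINGLE, "banana", null)));
        System.out.println(filter(terms.iterator(), new Compiled(Kind.NORMAL, null, t -> t.startsWith("c"))));
    }
}
```

Putting the switch in one place means callers never need to inspect the automaton type themselves, which is the kind of misuse the runtime checks currently guard against.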

        Attachments

        1. LUCENE-7627.patch
          5 kB
          Alan Woodward
        2. LUCENE-7627.patch
          5 kB
          Alan Woodward

            People

            • Assignee: Alan Woodward (romseygeek)
            • Reporter: Alan Woodward (romseygeek)