  1. Lucene - Core
  2. LUCENE-2788

Make CharFilter reusable

Details

    • Improvement
    • Status: Open
    • Minor
    • Resolution: Unresolved
    • None
    • None
    • modules/analysis
    • None
    • New, Patch Available

    Description

      The CharFilter API lets you wrap a Reader, altering the contents before the Tokenizer sees them.
      It also allows you to correct the offsets so this is transparent to highlighting.
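      The two ideas above (rewriting the character stream and mapping offsets back to the original text) can be sketched without any Lucene dependency. The class below is a hypothetical illustration, not Lucene's CharFilter: the name HyphenStrippingFilter, the drain() helper, and the correction table are all assumptions made for this example. Only the notion of a correctOffset-style mapping comes from the description.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch (not Lucene's CharFilter): drops '-' and records how offsets shifted. */
class HyphenStrippingFilter extends Reader {
    private final Reader in;
    // Each entry is {outputOffset, total chars removed so far}.
    private final List<int[]> corrections = new ArrayList<>();
    private int outPos = 0;   // characters emitted so far
    private int removed = 0;  // characters dropped so far

    HyphenStrippingFilter(Reader in) {
        this.in = in;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        int written = 0;
        while (written < len) {
            int c = in.read();
            if (c == -1) {
                break;
            }
            if (c == '-') {
                // Drop the char and remember that later output offsets shift by one more.
                removed++;
                corrections.add(new int[] {outPos, removed});
                continue;
            }
            buf[off + written++] = (char) c;
            outPos++;
        }
        // The loop only ends with nothing written when the input is exhausted.
        return (written == 0 && len > 0) ? -1 : written;
    }

    /** Maps an offset in the filtered text back to an offset in the original text. */
    int correctOffset(int outputOffset) {
        int diff = 0;
        for (int[] c : corrections) {
            if (c[0] > outputOffset) {
                break;
            }
            diff = c[1];
        }
        return outputOffset + diff;
    }

    /** Reads everything into a String; wraps IOException to keep callers simple. */
    String drain() {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[64];
        try {
            int n;
            while ((n = read(buf, 0, buf.length)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    @Override
    public void close() throws IOException {
        in.close();
    }
}
```

      With the input "wi-fi net", the consumer sees "wifi net". A token spanning filtered offsets 0 to 4 maps back to 0 to 5 in the original, so a highlighter would mark "wi-fi" rather than a shifted span.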

      One problem is that the API isn't reusable: if you have a lot of short documents, it's going to be inefficient, because a new CharFilter has to be created for each one.
      Additionally, there is some unnecessary and inconsistent wrapping in Tokenizer: the constructor calls CharReader.get on its input, but reset(Reader) does not.
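      To illustrate what "reusable" could mean here, the sketch below re-points one wrapper at successive inputs instead of allocating a wrapper per document. It is a minimal, hypothetical example built on java.io.FilterReader, not the attached patch and not Lucene's API: the class name ReusableLowerCaseFilter and its reset(Reader) and drain() methods are assumptions, and offset correction is left out for brevity.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical sketch: a wrapping reader that can be re-pointed at a new document. */
class ReusableLowerCaseFilter extends FilterReader {
    ReusableLowerCaseFilter(Reader in) {
        super(in);
    }

    /** Hypothetical reuse hook (an assumption, not an existing Lucene method). */
    void reset(Reader next) {
        // FilterReader.in is protected and non-final, so it can be swapped in place.
        this.in = next;
    }

    @Override
    public int read() throws IOException {
        int c = in.read();
        return c == -1 ? -1 : Character.toLowerCase(c);
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        int n = in.read(buf, off, len);
        // When n is -1 the loop body never runs.
        for (int i = off; i < off + n; i++) {
            buf[i] = Character.toLowerCase(buf[i]);
        }
        return n;
    }

    /** Reads everything into a String; wraps IOException to keep callers simple. */
    String drain() {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[64];
        try {
            int n;
            while ((n = read(buf, 0, buf.length)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }
}
```

      A caller would construct the filter once, drain it, then call reset(new StringReader(...)) for each following document. With many short documents this avoids one wrapper allocation per document, which is the inefficiency the description points at.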

      Attachments

        1. LUCENE-2788.patch
          55 kB
          Robert Muir

            People

              Assignee: Unassigned
              Reporter: Robert Muir (rcmuir)
              Votes: 2
              Watchers: 1
