Lucene - Core / LUCENE-4817

Add KeywordRepeaterFilter to emit tokens twice, once as keyword and once not as keyword

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 4.1
    • Fix Version/s: 4.3, 6.0
    • Component/s: modules/analysis
    • Labels: None
    • Lucene Fields: New, Patch Available

      Description

      If you want both a stemmed and an unstemmed version of a token, one for recall and one for precision, you have to use two fields today in most cases. Yet most stemmers respect the keyword attribute, so we could add a token filter that emits the same token twice, once as a keyword and once plain. Folks would most likely need to combine this with RemoveDuplicatesTokenFilter, but that way we can have the stemmed and unstemmed versions in the same field.
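
      The idea can be sketched in plain Java without any Lucene dependency. Everything in this sketch is a hypothetical stand-in chosen for illustration, not the Lucene API or the attached patch: the Token class, the toy suffix-stripping stemmer, and the method names repeat, stem, removeDuplicates and analyze. It only shows the flow described above: repeat each token with a keyword flag, stem the unflagged copy, then drop copies the stemmer left unchanged.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class KeywordRepeatSketch {

    /** Hypothetical stand-in for a token: text, keyword flag, position increment. */
    static final class Token {
        final String text;
        final boolean keyword;
        final int posInc; // 0 means "same position as the previous token"

        Token(String text, boolean keyword, int posInc) {
            this.text = text;
            this.keyword = keyword;
            this.posInc = posInc;
        }
    }

    /** Emit every term twice: first as a keyword, then plain at the same position. */
    static List<Token> repeat(List<String> terms) {
        List<Token> out = new ArrayList<>();
        for (String term : terms) {
            out.add(new Token(term, true, 1));
            out.add(new Token(term, false, 0));
        }
        return out;
    }

    /** Toy stemmer (an assumption, not a real algorithm): strips a trailing "ing" or "s". */
    static String toyStem(String s) {
        if (s.length() > 5 && s.endsWith("ing")) return s.substring(0, s.length() - 3);
        if (s.length() > 3 && s.endsWith("s")) return s.substring(0, s.length() - 1);
        return s;
    }

    /** Stem only the tokens that are not marked as keywords. */
    static List<Token> stem(List<Token> in) {
        List<Token> out = new ArrayList<>();
        for (Token t : in) {
            out.add(t.keyword ? t : new Token(toyStem(t.text), false, t.posInc));
        }
        return out;
    }

    /** Drop a token whose text was already seen at the same position. */
    static List<Token> removeDuplicates(List<Token> in) {
        List<Token> out = new ArrayList<>();
        Set<String> seenHere = new HashSet<>();
        for (Token t : in) {
            if (t.posInc > 0) seenHere.clear(); // a new position starts
            if (seenHere.add(t.text)) out.add(t);
        }
        return out;
    }

    /** Full chain: repeat, stem, de-duplicate, then return the token texts. */
    static List<String> analyze(List<String> terms) {
        List<String> texts = new ArrayList<>();
        for (Token t : removeDuplicates(stem(repeat(terms)))) texts.add(t.text);
        return texts;
    }

    public static void main(String[] args) {
        System.out.println(analyze(List.of("walking", "dog"))); // [walking, walk, dog]
    }
}
```

      In this sketch "walking" yields both the original and the stemmed form at one position, while "dog" is emitted once because its plain copy is unchanged by the stemmer and is removed as a duplicate.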

        Attachments

        1. LUCENE-4817.patch
          5 kB
          Simon Willnauer
        2. LUCENE-4817.patch
          10 kB
          Simon Willnauer
        3. docs.patch
          5 kB
          Erick Erickson
        4. docs.patch
          5 kB
          Erick Erickson

              People

              • Assignee: Unassigned
              • Reporter: Simon Willnauer (simonw)
              • Votes: 1
              • Watchers: 5
