SOLR-2211: Create Solr FilterFactory for Lucene StandardTokenizer with UAX#29 support

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 3.1
    • Fix Version/s: 3.1, 4.0-ALPHA
    • Component/s: None
    • Labels: None

    Description

      The Lucene 3.x StandardTokenizer with UAX#29 support improves tokenization of non-English text. At present it can be invoked by using the StandardTokenizerFactory and setting the Version to 3.1. However, it would be useful to be able to use the improved Unicode processing without necessarily including the IP address and email address processing of StandardAnalyzer. A FilterFactory that allows the StandardTokenizer with UAX#29 support to be used on its own would meet this need.
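
      To illustrate the kind of word-boundary segmentation the description refers to, here is a minimal sketch that uses only the JDK's java.text.BreakIterator. It is an assumption-laden stand-in: BreakIterator applies the JDK's own word-boundary rules, which are similar in spirit to UAX#29 but are not Lucene's StandardTokenizer, and the class name WordBreakSketch and the method words are hypothetical names introduced here for illustration.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not part of Lucene or Solr.
public class WordBreakSketch {

    // Splits text at the JDK's word boundaries and keeps only the
    // segments that contain at least one letter or digit.
    public static List<String> words(String text) {
        BreakIterator it = BreakIterator.getWordInstance(Locale.ROOT);
        it.setText(text);
        List<String> out = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String segment = text.substring(start, end);
            // Whitespace and punctuation form their own segments; skip them.
            if (segment.codePoints().anyMatch(Character::isLetterOrDigit)) {
                out.add(segment);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(words("Hello, world"));
    }
}
```

      The sketch applies no special handling for IP addresses or email addresses, which is the separation this issue asks for; how Lucene itself segments such input is not shown here.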

      Attachments

        1. SOLR-2211.patch
          6 kB
          Tom Burton-West


            People

              Assignee: Robert Muir (rcmuir)
              Reporter: Tom Burton-West (tburtonwest)
