Lucene - Core: LUCENE-1390

add ASCIIFoldingFilter and deprecate ISOLatin1AccentFilter

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.9
    • Component/s: modules/analysis
    • Labels: None
    • Environment: any

    • Lucene Fields: New, Patch Available

    Description

      The ISOLatin1AccentFilter removes accents from accented characters in the ISO Latin 1 character set.
      It works as intended; this is not a bug report.

      It would be nicer, though, to have a more comprehensive version of this code that covers not just ISO Latin 1 (ISO-8859-1) but the entire Latin-1 Supplement and Latin Extended-A Unicode blocks.
      See: http://en.wikipedia.org/wiki/Latin-1_Supplement_unicode_block
      See: http://en.wikipedia.org/wiki/Latin_Extended-A_unicode_block

      That way, all languages using Roman characters are covered.
      A new class, ISOLatinAccentFilter, is attached. It is intended to supersede ISOLatin1AccentFilter, which should be deprecated.
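
      To illustrate the kind of folding requested here, the following is a minimal sketch only, not the attached patch and not Lucene's API. It assumes a hypothetical helper class, AsciiFoldSketch, and swaps in a different technique from an explicit per-character table: it uses the JDK's java.text.Normalizer to decompose characters and drops the combining marks. A few letters in the two blocks have no canonical decomposition, so the sketch maps a small, incomplete selection of them by hand; the chosen replacements are assumptions.

```java
import java.text.Normalizer;

// Hypothetical illustration class; the name and the mappings are assumptions.
class AsciiFoldSketch {

    // Folds accented Latin letters to ASCII look-alikes.
    static String fold(String input) {
        // NFD splits e.g. U+00E9 into 'e' followed by U+0301 (combining acute).
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        StringBuilder out = new StringBuilder(decomposed.length());
        for (int i = 0; i < decomposed.length(); i++) {
            char c = decomposed.charAt(i);
            if (Character.getType(c) == Character.NON_SPACING_MARK) {
                continue; // drop the accent itself
            }
            switch (c) {
                // Letters without a canonical decomposition (incomplete list).
                case '\u00C6': out.append("AE"); break; // AE ligature
                case '\u00E6': out.append("ae"); break;
                case '\u00D8': out.append('O');  break; // O with stroke
                case '\u00F8': out.append('o');  break;
                case '\u00DF': out.append("ss"); break; // sharp s
                case '\u0110': out.append('D');  break; // D with stroke
                case '\u0111': out.append('d');  break;
                case '\u0141': out.append('L');  break; // L with stroke
                case '\u0142': out.append('l');  break;
                default:       out.append(c);
            }
        }
        return out.toString();
    }
}
```

      For example, AsciiFoldSketch.fold("caf\u00E9") returns "cafe". Characters outside the hand-written cases that have no decomposition pass through unchanged, which is one reason a full mapping table may be preferable in a real filter.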

      Attachments

        1. ASCIIFoldingFilter.patch
          207 kB
          Steven Rowe
        2. ASCIIFoldingFilter.patch
          207 kB
          Steven Rowe
        3. ASCIIFoldingFilter.patch
          204 kB
          Andi Vajda

            People

              Assignee: Mark Miller
              Reporter: Andi Vajda