Details

    • Type: New Feature
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0-ALPHA
    • Component/s: modules/analysis
    • Labels: None
    • Lucene Fields: New, Patch Available

    Description

      Currently, the CzechAnalyzer only removes stopwords, and Snowball does not include a Czech stemmer.

      This patch implements the light stemming algorithm described in: http://portal.acm.org/citation.cfm?id=1598600

      In the authors' measurements, it improves mean average precision (MAP) by 42%.

      For backward compatibility, the analyzer does not use this stemmer if LUCENE_VERSION <= 3.0.
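
      As a rough illustration of the version gate described above, here is a minimal, self-contained sketch. It is not the code from the attached patch and does not reproduce the published algorithm: the class name CzechLightStemSketch, the SketchVersion enum, the tiny stopword list, and the suffix list are all hypothetical stand-ins chosen for brevity. It only shows the shape of the idea: stopword removal always runs, and a light suffix-stripping step is applied only when the requested version is newer than 3.0.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch; none of these names come from the patch or from Lucene.
class CzechLightStemSketch {

    // Stand-in for a compatibility version constant (assumption).
    enum SketchVersion { LUCENE_30, LUCENE_31 }

    // Tiny illustrative stopword list, not the real CzechAnalyzer list.
    private static final Set<String> STOPWORDS =
            new HashSet<>(Arrays.asList("a", "v", "na", "se"));

    // Illustrative subset of noun case endings, longest first.
    // The real algorithm handles many more endings plus diacritics.
    private static final String[] SUFFIXES =
            {"ami", "ech", "em", "ou", "y", "a", "u", "e", "i"};

    // Strip at most one suffix, keeping a stem of at least 3 characters.
    static String stem(String term) {
        for (String suffix : SUFFIXES) {
            int stemLength = term.length() - suffix.length();
            if (term.endsWith(suffix) && stemLength >= 3) {
                return term.substring(0, stemLength);
            }
        }
        return term;
    }

    // Lowercase, split on whitespace, drop stopwords, then stem
    // only when the requested version is newer than 3.0.
    static List<String> analyze(String text, SketchVersion matchVersion) {
        boolean useStemmer = matchVersion.compareTo(SketchVersion.LUCENE_31) >= 0;
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (token.isEmpty() || STOPWORDS.contains(token)) {
                continue;
            }
            terms.add(useStemmer ? stem(token) : token);
        }
        return terms;
    }
}
```

      With this sketch, analyze("hrady a domy", SketchVersion.LUCENE_30) keeps the surface forms, while the same call with SketchVersion.LUCENE_31 also strips the plural ending. Gating on the version means an index built under the older behavior still matches at query time.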

      Attachments

        1. LUCENE-2067.patch
          32 kB
          Robert Muir
        2. LUCENE-2067.patch
          32 kB
          Robert Muir
        3. LUCENE-2067.patch
          23 kB
          Robert Muir
        4. LUCENE-2067.patch
          23 kB
          Robert Muir

          People

            Assignee: Robert Muir (rcmuir)
            Reporter: Robert Muir (rcmuir)