Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.3, 4.0-ALPHA
    • Component/s: modules/analysis
    • Labels:
      None
    • Environment:

      Operating System: other
      Platform: Other

      Description

      September 10th, 2003 contribution from "Sergio Guzman-Lara" <guzman@cs.umass.edu>

      Original email:

      Hi all,

      I have ported the kstem stemmer to Java and incorporated it into
      Lucene. You can get the source code (Kstem.jar) from the following website:

      http://ciir.cs.umass.edu/downloads/

      Just click on "KStem Java Implementation". You will need to register
      your e-mail (free, of course) with the CIIR, the Center for Intelligent
      Information Retrieval at UMass, to get an access code.

      Content of Kstem.jar:

      java/org/apache/lucene/analysis/KStemData1.java
      java/org/apache/lucene/analysis/KStemData2.java
      java/org/apache/lucene/analysis/KStemData3.java
      java/org/apache/lucene/analysis/KStemData4.java
      java/org/apache/lucene/analysis/KStemData5.java
      java/org/apache/lucene/analysis/KStemData6.java
      java/org/apache/lucene/analysis/KStemData7.java
      java/org/apache/lucene/analysis/KStemData8.java
      java/org/apache/lucene/analysis/KStemFilter.java
      java/org/apache/lucene/analysis/KStemmer.java

      KStemData1.java, ..., KStemData8.java   Contain several lists of words used by Kstem
      KStemmer.java                           Implements the Kstem algorithm
      KStemFilter.java                        Extends TokenFilter, applying Kstem
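
      As a rough illustration of how these three pieces fit together (word
      lists, a stemmer that consults them, and a filter that applies the
      stemmer to every token), here is a dependency-free sketch. It is not the
      contributed code and does not use the Lucene API: the class name
      KStemSketch, the tiny LEXICON, the suffix list, and the iterator-based
      filter are all assumptions made up for this example, and the real Kstem
      rules are considerably more elaborate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class KStemSketch {

    // Hypothetical stand-in for the word lists held in KStemData1..8.
    static final Set<String> LEXICON =
            new HashSet<>(Arrays.asList("dog", "walk", "news", "bus"));

    // Toy suffix list, tried in this order; not the actual Kstem rule set.
    static final String[] SUFFIXES = {"ing", "ed", "es", "s"};

    // Dictionary-guarded stemming: a known word is left alone, and a suffix
    // is only stripped when the remaining stem is itself a known word.
    static String stem(String term) {
        String word = term.toLowerCase(Locale.ROOT);
        if (LEXICON.contains(word)) {
            return word;
        }
        for (String suffix : SUFFIXES) {
            if (word.endsWith(suffix) && word.length() > suffix.length() + 1) {
                String candidate = word.substring(0, word.length() - suffix.length());
                if (LEXICON.contains(candidate)) {
                    return candidate;
                }
            }
        }
        return word; // nothing matched: return the lowercased term unchanged
    }

    // Plays the role of the filter: wraps an upstream token source and
    // stems each token as it is pulled through.
    static final class StemmingIterator implements Iterator<String> {
        private final Iterator<String> input;

        StemmingIterator(Iterator<String> input) {
            this.input = input;
        }

        @Override
        public boolean hasNext() {
            return input.hasNext();
        }

        @Override
        public String next() {
            return stem(input.next());
        }
    }

    static List<String> stemAll(List<String> tokens) {
        List<String> out = new ArrayList<>();
        Iterator<String> it = new StemmingIterator(tokens.iterator());
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [dog, walk, news, bus]
        System.out.println(stemAll(Arrays.asList("Dogs", "walking", "news", "buses")));
    }
}
```

      Note that "news" survives intact because it is already in the word
      list, which is the kind of over-stemming a dictionary check is meant to
      prevent. A term with no known stem, such as "xyzzys", is simply
      lowercased and passed through.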

      To compile

      Unjar Kstem.jar into Lucene's "src" directory and compile it
      there.

      What is Kstem?

      A stemmer designed by Bob Krovetz (for more information see
      http://ciir.cs.umass.edu/pubfiles/ir-35.pdf).

      Copyright issues

      This is open source. The actual license agreement is included at the
      top of every source file.

      Any comments/questions/suggestions are welcome,

      Sergio Guzman-Lara
      Senior Research Fellow
      CIIR UMass

      1. lucid_kstem.tgz
        650 kB
        Yonik Seeley
      2. LUCENE-152.patch
        388 kB
        Robert Muir
      3. LUCENE-152_optimization.patch
        0.9 kB
        Yonik Seeley
      4. LUCENE-152_optimization.patch
        2 kB
        Yonik Seeley
      5. LUCENE-152_alt.patch
        1 kB
        Robert Muir
      6. kstemTestData.zip
        54 kB
        Robert Muir

            People

            • Assignee:
              Robert Muir
              Reporter:
              Otis Gospodnetic
            • Votes:
              9
              Watchers:
              7
