  1. Lucene - Core
  2. LUCENE-152

[PATCH] KStem for Lucene

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.3, 4.0-ALPHA
    • Component/s: modules/analysis
    • Labels: None
    • Environment: Operating System: other
      Platform: Other
    • Bugzilla Id: 23782

    Description

      September 10th, 2003 contribution from "Sergio Guzman-Lara" <guzman@cs.umass.edu>

      Original email:

      Hi all,

      I have ported the kstem stemmer to Java and incorporated it into
      Lucene. You can get the source code (Kstem.jar) from the following website:

      http://ciir.cs.umass.edu/downloads/

      Just click on "KStem Java Implementation" (you will need to register
      your e-mail, for free of course, with the CIIR, the Center for Intelligent
      Information Retrieval at UMass, and get an access code).

      Content of Kstem.jar:

      java/org/apache/lucene/analysis/KStemData1.java
      java/org/apache/lucene/analysis/KStemData2.java
      java/org/apache/lucene/analysis/KStemData3.java
      java/org/apache/lucene/analysis/KStemData4.java
      java/org/apache/lucene/analysis/KStemData5.java
      java/org/apache/lucene/analysis/KStemData6.java
      java/org/apache/lucene/analysis/KStemData7.java
      java/org/apache/lucene/analysis/KStemData8.java
      java/org/apache/lucene/analysis/KStemFilter.java
      java/org/apache/lucene/analysis/KStemmer.java

      KStemData1.java, ..., KStemData8.java: contain several lists of words used by Kstem
      KStemmer.java: implements the Kstem algorithm
      KStemFilter.java: extends TokenFilter, applying Kstem
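
The division of labour between the three kinds of files (word lists, stemmer, filter) can be illustrated with a minimal, self-contained sketch. It does not use Lucene's classes and is not the KStem code: `StemSketch`, `Stemmer`, `ToyStemmer`, and `StemmingFilter` are hypothetical names, the suffix rules are placeholders rather than Krovetz's algorithm, and the assumption that the word lists act as a lookup lexicon is made only for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch: shows the stemmer/filter split only, not the real KStem code.
final class StemSketch {

    /** Plays the role KStemmer.java has in the jar: maps one word to its stem. */
    interface Stemmer {
        String stem(String word);
    }

    /**
     * Toy stemmer: strips a suffix only when the remainder is a known word.
     * The lexicon stands in for the KStemData*.java word lists (assumption);
     * the suffix rules are placeholders, NOT Krovetz's algorithm.
     */
    static final class ToyStemmer implements Stemmer {
        private static final String[] SUFFIXES = {"ing", "ed", "es", "s"};
        private final Set<String> lexicon;

        ToyStemmer(Set<String> lexicon) {
            this.lexicon = lexicon;
        }

        @Override
        public String stem(String word) {
            String lower = word.toLowerCase(Locale.ROOT);
            if (lexicon.contains(lower)) {
                return lower; // already a known word, leave it alone
            }
            for (String suffix : SUFFIXES) {
                if (lower.endsWith(suffix)) {
                    String candidate = lower.substring(0, lower.length() - suffix.length());
                    if (lexicon.contains(candidate)) {
                        return candidate; // accept only stems found in the lexicon
                    }
                }
            }
            return lower; // no rule produced a known word
        }
    }

    /** Plays the role KStemFilter.java has: wraps a token stream and stems each token. */
    static final class StemmingFilter implements Iterator<String> {
        private final Iterator<String> input;
        private final Stemmer stemmer;

        StemmingFilter(Iterator<String> input, Stemmer stemmer) {
            this.input = input;
            this.stemmer = stemmer;
        }

        @Override
        public boolean hasNext() {
            return input.hasNext();
        }

        @Override
        public String next() {
            return stemmer.stem(input.next());
        }
    }

    /** Runs a list of tokens through the filter using a tiny illustrative lexicon. */
    static List<String> stemAll(List<String> tokens) {
        Stemmer stemmer = new ToyStemmer(Set.of("dog", "walk", "bus"));
        Iterator<String> filtered = new StemmingFilter(tokens.iterator(), stemmer);
        List<String> out = new ArrayList<>();
        while (filtered.hasNext()) {
            out.add(filtered.next());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(stemAll(List.of("Dogs", "walking", "bus")));
    }
}
```

Keeping the stemming logic behind a small interface and the stream handling in a separate wrapper means the algorithm can be tested on single words, while the wrapper stays a thin adapter over whatever token source the analysis chain provides.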

      To compile

      Unjar the file Kstem.jar into Lucene's "src" directory, and compile it
      there.

      What is Kstem?

      A stemmer designed by Bob Krovetz (for more information see
      http://ciir.cs.umass.edu/pubfiles/ir-35.pdf).

      Copyright issues

      This is open source. The actual license agreement is included at the
      top of every source file.

      Any comments/questions/suggestions are welcome,

      Sergio Guzman-Lara
      Senior Research Fellow
      CIIR UMass

      Attachments

        1. kstemTestData.zip
          54 kB
          Robert Muir
        2. LUCENE-152_alt.patch
          1 kB
          Robert Muir
        3. LUCENE-152_optimization.patch
          2 kB
          Yonik Seeley
        4. LUCENE-152_optimization.patch
          0.9 kB
          Yonik Seeley
        5. LUCENE-152.patch
          388 kB
          Robert Muir
        6. lucid_kstem.tgz
          650 kB
          Yonik Seeley

            People

              Assignee: rcmuir Robert Muir
              Reporter: otis@apache.org Otis Gospodnetic
              Votes: 9
              Watchers: 6
