Details
Improvement
Status: Closed
Minor
Resolution: Fixed
Operating System: other
Platform: Other
23782
Description
September 10th, 2003 contribution from "Sergio Guzman-Lara" <guzman@cs.umass.edu>
Original email:
Hi all,
I have ported the kstem stemmer to Java and incorporated it into
Lucene. You can get the source code (Kstem.jar) from the following website:
http://ciir.cs.umass.edu/downloads/
Just click on "KStem Java Implementation". You will need to register
your e-mail (for free, of course) with the CIIR, the Center for Intelligent
Information Retrieval at UMass, and get an access code.
Content of Kstem.jar:
java/org/apache/lucene/analysis/KStemData1.java
java/org/apache/lucene/analysis/KStemData2.java
java/org/apache/lucene/analysis/KStemData3.java
java/org/apache/lucene/analysis/KStemData4.java
java/org/apache/lucene/analysis/KStemData5.java
java/org/apache/lucene/analysis/KStemData6.java
java/org/apache/lucene/analysis/KStemData7.java
java/org/apache/lucene/analysis/KStemData8.java
java/org/apache/lucene/analysis/KStemFilter.java
java/org/apache/lucene/analysis/KStemmer.java
KStemData1.java, ..., KStemData8.java: contain several lists of words used by Kstem
KStemmer.java: implements the Kstem algorithm
KStemFilter.java: extends TokenFilter, applying Kstem
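The division of labour described above (word lists, a stemmer, and a filter that applies the stemmer to each token) can be sketched roughly as follows. This is only an illustration of the wrapping pattern and does not use the Lucene API: `Stemmer`, `ToyDictionaryStemmer`, `StemmingFilter` and `stemAll` are hypothetical names, and the toy stemmer merely strips a trailing "s" when the result appears in a small word list. It is not the Kstem algorithm.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

public class StemFilterSketch {

    /** Hypothetical stand-in for a stemmer class such as KStemmer. */
    interface Stemmer {
        String stem(String term);
    }

    /**
     * Toy stemmer (NOT the Kstem algorithm): strips a trailing "s"
     * only if the shortened form is found in the supplied word list.
     */
    static class ToyDictionaryStemmer implements Stemmer {
        private final Set<String> lexicon;

        ToyDictionaryStemmer(Set<String> lexicon) {
            this.lexicon = lexicon;
        }

        @Override
        public String stem(String term) {
            if (term.endsWith("s")) {
                String candidate = term.substring(0, term.length() - 1);
                if (lexicon.contains(candidate)) {
                    return candidate;
                }
            }
            return term; // leave unknown forms untouched
        }
    }

    /** Wraps an upstream token source and stems each token as it passes through. */
    static class StemmingFilter implements Iterator<String> {
        private final Iterator<String> input;
        private final Stemmer stemmer;

        StemmingFilter(Iterator<String> input, Stemmer stemmer) {
            this.input = input;
            this.stemmer = stemmer;
        }

        @Override
        public boolean hasNext() {
            return input.hasNext();
        }

        @Override
        public String next() {
            return stemmer.stem(input.next());
        }
    }

    /** Convenience helper: runs every token through the filter and collects the results. */
    static List<String> stemAll(List<String> tokens, Stemmer stemmer) {
        List<String> out = new ArrayList<>();
        Iterator<String> filter = new StemmingFilter(tokens.iterator(), stemmer);
        while (filter.hasNext()) {
            out.add(filter.next());
        }
        return out;
    }

    public static void main(String[] args) {
        Stemmer stemmer = new ToyDictionaryStemmer(
                new HashSet<>(Arrays.asList("stemmer", "word")));
        // prints [stemmer, word, lucene, class]
        System.out.println(stemAll(
                Arrays.asList("stemmers", "words", "lucene", "class"), stemmer));
    }
}
```

Keeping the stemmer behind its own interface means the filter only has to forward tokens, which is presumably why the contribution separates KStemmer.java from KStemFilter.java.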
To compile
Unjar the file Kstem.jar into Lucene's "src" directory and compile it
there.
What is Kstem?
A stemmer designed by Bob Krovetz (for more information see
http://ciir.cs.umass.edu/pubfiles/ir-35.pdf).
Copyright issues
This is open source. The actual license agreement is included at the
top of every source file.
Any comments/questions/suggestions are welcome,
Sergio Guzman-Lara
Senior Research Fellow
CIIR UMass