Snowball stemmer for the Lithuanian language.
The patch file only contains autogenerated code. Where is the Snowball source?
Snowball source code can be found here.
Also, I made a pull request to the official Snowball repository 9 days ago.
OK, until they incorporate it, I think we should add the .sbl file here as well so it's available.
And can you confirm it's OK for us to release this under the Apache 2.0 license?
I will take a deeper look and comment again after!
Snowball source code.
Yes, I confirm that it is OK to use the Apache 2.0 license.
Updated patch adding LithuanianAnalyzer, a stopwords set, and some basic tests.
Thanks for the contribution here! We really need support for this language and I like the stemmer. It seems to work well with nouns and adjectives and does not seem to suffer from overstemming issues.
If you have a chance, please have a look at the latest patch. I will look into this more tomorrow.
Thanks! When designing the stemmer, I put the emphasis on improving noun stemming because the majority of complaints from our clients were about mistakes in noun stemming. Of course, the Lithuanian language is complicated and there is plenty of room for improvement.
I've checked the latest patch and found no problems.
Thanks for checking! I plan to commit this later today.
Commit 1692544 from Robert Muir in branch 'dev/trunk'
[ https://svn.apache.org/r1692544 ]
LUCENE-6694: Add LithuanianAnalyzer and LithuanianStemmer
Commit 1692547 from Robert Muir in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1692547 ]
Bulk close for 5.3.0 release