Details
Type: Improvement
Status: Open
Priority: Minor
Resolution: Unresolved
4.0-ALPHA
N/A
New, Patch Available
Description
PorterStemFilter has functionality to detect if a term has been marked as a "keyword" by the KeywordMarkerFilter (KeywordAttribute.isKeyword() == true), and if so, skip stemming.
The suggestion is to have the same functionality in other filters where it is applicable. I think it may be particularly applicable to the LowerCaseFilter (i.e., if a term is a keyword, don't mess with its case) and the StopFilter (i.e., if a term is a keyword, don't filter it out even if it looks like a stop word).
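To illustrate the StopFilter half of the suggestion, here is a minimal sketch that does not use Lucene at all. SketchToken and KeywordAwareStopSketch are hypothetical stand-ins invented for this example; the keyword flag plays the role of KeywordAttribute.isKeyword(). It is not the attached patch, only an approximation of the intended behavior.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a token that carries a keyword flag.
// This is NOT a Lucene class; it only mimics KeywordAttribute.isKeyword().
final class SketchToken {
    final String term;
    final boolean keyword;

    SketchToken(String term, boolean keyword) {
        this.term = term;
        this.keyword = keyword;
    }
}

// Hypothetical keyword-aware stop filtering over a plain list of tokens.
final class KeywordAwareStopSketch {
    static List<String> filter(List<SketchToken> tokens, Set<String> stopWords) {
        List<String> kept = new ArrayList<>();
        for (SketchToken t : tokens) {
            // A keyword is kept even when it matches a stop word;
            // every other token is dropped only if it is a stop word.
            if (t.keyword || !stopWords.contains(t.term)) {
                kept.add(t.term);
            }
        }
        return kept;
    }
}
```

With the stop words "the" and "who", a token "who" that is flagged as a keyword (say, the band name) survives, while an unflagged "the" is removed.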
Backward compatibility is maintained (in both cases) by adding a new constructor which takes an additional boolean parameter ignoreKeyword. The current constructor will call this new constructor with ignoreKeyword = false.
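The constructor arrangement can be sketched roughly as follows. LowerCaseSketch is a hypothetical, Lucene-free stand-in, not the patched LowerCaseFilter. It also assumes one reading of the flag: ignoreKeyword = true means keyword terms are left untouched, and ignoreKeyword = false keeps the existing behavior of lowercasing everything.

```java
import java.util.Locale;

// Hypothetical stand-in for a lowercasing filter; not the Lucene class.
final class LowerCaseSketch {
    private final boolean ignoreKeyword;

    // Existing-style constructor: delegates with ignoreKeyword = false,
    // so callers that never pass the flag see unchanged behavior.
    LowerCaseSketch() {
        this(false);
    }

    // New constructor carrying the additional boolean parameter.
    LowerCaseSketch(boolean ignoreKeyword) {
        this.ignoreKeyword = ignoreKeyword;
    }

    // isKeyword mimics KeywordAttribute.isKeyword() for the current term.
    String apply(String term, boolean isKeyword) {
        if (ignoreKeyword && isKeyword) {
            return term; // assumed meaning: leave keyword terms as they are
        }
        return term.toLowerCase(Locale.ROOT);
    }
}
```

Under these assumptions, new LowerCaseSketch().apply("NASA", true) still lowercases the term, while new LowerCaseSketch(true).apply("NASA", true) returns it unchanged.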
Patches are attached (for LowerCaseFilter and StopFilter).
I have verified that the analysis JUnit tests run against the updated code, i.e., backward compatibility is maintained.