Details
- Improvement
- Status: Closed
- Minor
- Resolution: Fixed
- 2.9
- None
- New
Description
Because of the various tokenStream APIs Lucene has had, analyzer subclasses need to implement at least one of the methods returning a tokenStream. When both are implemented in the same analyzer, the code is almost identical. Each analyzer also defines the same inner class (SavedStreams), which is unnecessary.
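The duplication described above could be factored out roughly as follows. This is a minimal sketch and not Lucene's actual API: every class and method name in it (SketchTokenStream, SketchAnalyzerBase, createStream, SketchLowerCaseAnalyzer) is hypothetical, and the "token stream" is a toy stand-in that only lowercases its input. The point is that a subclass describes how to build its stream once, while the base class owns the per-thread saved stream that each analyzer currently re-implements.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a token stream: wraps a Reader and can be re-pointed at a new one.
class SketchTokenStream {
    private Reader input;

    SketchTokenStream(Reader input) {
        this.input = input;
    }

    // Re-targets this stream at a new Reader so the instance can be reused.
    void reset(Reader newInput) {
        this.input = newInput;
    }

    // Toy "analysis": reads all remaining characters and lowercases them.
    String consume() {
        StringBuilder sb = new StringBuilder();
        try {
            int c;
            while ((c = input.read()) != -1) {
                sb.append(Character.toLowerCase((char) c));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }
}

// Hypothetical base class: subclasses implement createStream once,
// and both public entry points are derived from it.
abstract class SketchAnalyzerBase {
    // Replaces the per-analyzer SavedStreams inner class with one shared slot per thread.
    private final ThreadLocal<SketchTokenStream> saved = new ThreadLocal<>();

    protected abstract SketchTokenStream createStream(String fieldName, Reader reader);

    // Always builds a fresh stream.
    final SketchTokenStream tokenStream(String fieldName, Reader reader) {
        return createStream(fieldName, reader);
    }

    // Builds the stream on first use in a thread, then resets and reuses it.
    final SketchTokenStream reusableTokenStream(String fieldName, Reader reader) {
        SketchTokenStream stream = saved.get();
        if (stream == null) {
            stream = createStream(fieldName, reader);
            saved.set(stream);
        } else {
            stream.reset(reader);
        }
        return stream;
    }
}

// A concrete analyzer now contains only the part that differs between analyzers.
class SketchLowerCaseAnalyzer extends SketchAnalyzerBase {
    @Override
    protected SketchTokenStream createStream(String fieldName, Reader reader) {
        return new SketchTokenStream(reader);
    }
}
```

With this shape, the reuse logic lives in one place and can be declared final, so subclasses cannot let the two methods drift apart.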
In contrib, almost every analyzer uses stopwords, and each one either implements its own way of loading them or defines a large number of ctors to load stopwords from a file, set, array, etc. Those ctors should be deprecated and eventually removed.
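One way to collapse the many stopword ctors is to have every source of stopwords end in a Set, so that each analyzer needs only a single Set-based ctor. The sketch below illustrates that idea under stated assumptions: SketchStopwords, SketchStopwordAnalyzer and SketchDutchAnalyzer are hypothetical names, not Lucene classes, and the file format (one word per line, blank lines skipped) is assumed for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper: converts arrays and files into a Set in one shared place.
final class SketchStopwords {
    private SketchStopwords() {}

    // Builds an unmodifiable stopword set from literal words.
    static Set<String> of(String... words) {
        return Collections.unmodifiableSet(new LinkedHashSet<>(Arrays.asList(words)));
    }

    // Assumed format: one word per line; surrounding whitespace trimmed, blank lines skipped.
    static Set<String> load(Reader reader) {
        Set<String> words = new LinkedHashSet<>();
        try (BufferedReader in = new BufferedReader(reader)) {
            String line;
            while ((line = in.readLine()) != null) {
                String word = line.trim();
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Collections.unmodifiableSet(words);
    }
}

// Hypothetical base class: stores the stopwords once for all stopword-aware analyzers.
abstract class SketchStopwordAnalyzer {
    protected final Set<String> stopwords;

    protected SketchStopwordAnalyzer(Set<String> stopwords) {
        // Defensive copy so later changes to the caller's set do not affect the analyzer.
        this.stopwords = Collections.unmodifiableSet(new LinkedHashSet<>(stopwords));
    }

    boolean isStopword(String term) {
        return stopwords.contains(term);
    }
}

// A language analyzer keeps a single ctor instead of one per input type.
final class SketchDutchAnalyzer extends SketchStopwordAnalyzer {
    SketchDutchAnalyzer(Set<String> stopwords) {
        super(stopwords);
    }
}
```

A caller that previously relied on a file or array ctor would first build the set, for example with SketchStopwords.load(reader), and then pass it to the analyzer.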
Attachments
Issue Links
- incorporates: LUCENE-1967 make it easier to access default stopwords for language analyzers (Closed)
- is related to: LUCENE-2100 Make contrib analyzers final (Closed)
- relates to: LUCENE-2051 Contrib Analyzer Setters should be deprecated and replace with ctor arguments (Closed)