Details
- Task
- Status: Reopened
- Minor
- Resolution: Fixed
- None
- None
- New
Description
Each built-in analysis component (a factory for a tokenizer, char filter, or token filter) has an SPI name, but currently this is not documented anywhere.
The goals of this issue:
- Define SPI names as a static final field for each analysis component so that users can get the component by name (via the static NAME field). This also provides compile-time safety.
- Officially document the SPI names in Javadocs.
- Add proper source validation rules to the ant validate-source-patterns target so that we can make sure that all analysis components have correct field definitions and documentation.
- Look up SPI names from the new NAME fields instead of deriving them from class names.
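To make the goals above concrete, here is a minimal sketch of the two lookup strategies. It does not use Lucene's actual loader code: the class `SpiNameSketch`, the two nested factory classes, and the methods `lookupSpiName` and `deriveFromClassName` are all hypothetical stand-ins, and the "strip the suffix and lowercase" rule is only an approximation of how names were derived from class names.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Locale;

public class SpiNameSketch {

  // Hypothetical stand-in for a built-in factory that declares its SPI name.
  public static class WhitespaceTokenizerFactory {
    /** SPI name (this is what the Javadocs would document). */
    public static final String NAME = "whitespace";
  }

  // Hypothetical factory that is missing the NAME field.
  public static class LegacyTokenizerFactory {
  }

  /**
   * New approach (sketch): read the SPI name from a
   * "public static final String NAME" field via reflection.
   * Throws IllegalStateException if the field is missing or malformed,
   * which is also roughly what a validation rule would check for.
   */
  public static String lookupSpiName(Class<?> factoryClass) {
    try {
      Field field = factoryClass.getField("NAME");
      int mod = field.getModifiers();
      if (Modifier.isStatic(mod) && Modifier.isFinal(mod)
          && field.getType() == String.class) {
        return (String) field.get(null);
      }
    } catch (NoSuchFieldException | IllegalAccessException e) {
      // fall through to the exception below
    }
    throw new IllegalStateException(
        "No valid NAME field on " + factoryClass.getName());
  }

  /**
   * Old approach (approximation): strip the factory suffix from the
   * simple class name and lowercase the remainder.
   */
  public static String deriveFromClassName(Class<?> factoryClass, String suffix) {
    String simpleName = factoryClass.getSimpleName();
    if (!simpleName.endsWith(suffix)) {
      throw new IllegalArgumentException(
          simpleName + " does not end with " + suffix);
    }
    return simpleName
        .substring(0, simpleName.length() - suffix.length())
        .toLowerCase(Locale.ROOT);
  }

  public static void main(String[] args) {
    // Both strategies agree for this class, but only the first one is
    // explicit in the source and checkable at compile time.
    System.out.println(lookupSpiName(WhitespaceTokenizerFactory.class));
    System.out.println(
        deriveFromClassName(WhitespaceTokenizerFactory.class, "TokenizerFactory"));
  }
}
```

With a declared field, callers can write `WhitespaceTokenizerFactory.NAME` instead of a string literal, so a typo in the name becomes a compile error rather than a runtime lookup failure.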
(Just for quick reference) we now have:
- 19 Tokenizers (TokenizerFactory.availableTokenizers())
- 6 CharFilters (CharFilterFactory.availableCharFilters())
- 118 TokenFilters (TokenFilterFactory.availableTokenFilters())
Attachments
Issue Links
- blocks: LUCENE-8566 Deprecate methods in CustomAnalyzer.Builder which take factory classes (Resolved)
- is related to: LUCENE-8874 Show SPI names only instead of class names in Luke Analysis tab (Closed)
- is related to: LUCENE-8907 Provide backward compatibility for loading analysis factories (Closed)
- links to