Details
Type: Improvement
Status: Closed
Priority: Minor
Resolution: Fixed
Lucene Fields: New, Patch Available
Description
ICUNormalizer2FilterFactory supports a filter attribute that defines a Unicode set filter, but ICUFoldingFilterFactory does not. Such a filter makes it possible, for example, to exclude a set of characters from being folded. For Finnish and Swedish, the filter could be defined like this:
<filter class="solr.ICUFoldingFilterFactory" filter="[^åäöÅÄÖ]"/>
Note: An additional MappingCharFilterFactory or solr.LowerCaseFilterFactory is needed to lowercase the characters excluded from folding. The filter attribute is similar to what Elasticsearch provides (see https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-icu-folding.html).
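For illustration, a complete field type using the proposed attribute might look roughly like the sketch below. The field type name text_fi_sv and the choice of tokenizer are assumptions made for this example, not part of the patch; the filter attribute on ICUFoldingFilterFactory is the one proposed in this issue.

```xml
<!-- Hypothetical field type; the name and tokenizer are illustrative only -->
<fieldType name="text_fi_sv" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Fold everything except å, ä, ö (both cases); uses the proposed filter attribute -->
    <filter class="solr.ICUFoldingFilterFactory" filter="[^åäöÅÄÖ]"/>
    <!-- Lowercase the characters that were excluded from folding -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Placing the lowercase filter after the folding filter means the excluded characters Å, Ä and Ö are still normalized to å, ä and ö, while all other characters are folded as usual.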
I'll add a patch that implements this in the same way as ICUNormalizer2FilterFactory. It applies at least to master and branch_7x.