Details
Type: Bug
Status: Resolved
Priority: Minor
Resolution: Won't Fix
Discovered while performing DAM searches in Adobe Experience Manager.
Description
I believe OakAnalyzer applies LowerCaseFilter and WordDelimiterFilter in the wrong order. WordDelimiterFilter is invoked with the GENERATE_WORD_PARTS flag, which splits camelCase/PascalCase tokens into multiple terms. Because LowerCaseFilter is applied first, the mixed case is already gone by the time WordDelimiterFilter runs, so the terms can't be split.
As a result, a search for savings against the damAssetLucene index (which uses the default OakAnalyzer) does not find an asset named savingsAccount.svg.
After configuring the index's analyzers (/oak:index/damAssetLucene/analyzers) to apply WordDelimiterFilter before LowerCaseFilter, the search returned the asset as expected. The configuration used was:
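To illustrate why the order matters, here is a minimal sketch in plain Java. It does not use Lucene or Oak: lowerCase and splitOnCaseChange are hypothetical, simplified stand-ins for the two filters, and the file extension is ignored, so this only models the ordering effect and not the real analyzer chain.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical, simplified model of the two filter steps; not Lucene code.
final class FilterOrderSketch {

    // Stand-in for a lower-casing filter: lower-cases every token.
    static List<String> lowerCase(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            out.add(t.toLowerCase(Locale.ROOT));
        }
        return out;
    }

    // Stand-in for case-change splitting: starts a new part at every
    // lower-case to upper-case transition inside a token.
    static List<String> splitOnCaseChange(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            int start = 0;
            for (int i = 1; i < t.length(); i++) {
                boolean boundary = Character.isLowerCase(t.charAt(i - 1))
                        && Character.isUpperCase(t.charAt(i));
                if (boundary) {
                    out.add(t.substring(start, i));
                    start = i;
                }
            }
            out.add(t.substring(start));
        }
        return out;
    }

    // Order described in this report: lower-case first, then split.
    static List<String> lowerThenSplit(String term) {
        return splitOnCaseChange(lowerCase(List.of(term)));
    }

    // Proposed order: split first, then lower-case.
    static List<String> splitThenLower(String term) {
        return lowerCase(splitOnCaseChange(List.of(term)));
    }

    public static void main(String[] args) {
        System.out.println(lowerThenSplit("savingsAccount")); // [savingsaccount]
        System.out.println(splitThenLower("savingsAccount")); // [savings, account]
    }
}
```

In this model, only the second ordering produces a savings term that a query for savings could match.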
{
  "jcr:primaryType": "nt:unstructured",
  "default": {
    "jcr:primaryType": "nt:unstructured",
    "tokenizer": {
      "jcr:primaryType": "nt:unstructured",
      "name": "Standard"
    },
    "filters": {
      "jcr:primaryType": "nt:unstructured",
      "WordDelimiter": {"jcr:primaryType": "nt:unstructured"},
      "LowerCase": {"jcr:primaryType": "nt:unstructured"}
    }
  }
}