Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 6.6
Fix Version/s: None
Description
There seems to be an issue when doing proximity searches that include terms that have multi-word synonyms.
Example:
Suppose the following is configured in synonyms.txt:
(
grand mother, grandmother
grandfather, granddad
)
and there is an indexed field containing: (My mother and my grandmother went...)
A proximity search with: ("mother grandmother"~8)
does not return the document, while ("father grandfather"~8) does return the analogous document.
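To compare how the two queries are parsed, the requests can be built with debug=query enabled. This is only a minimal sketch: the base URL and the core name `mycore` are hypothetical placeholders, and the field name `laudo` is taken from the debug output in this report.

```python
from urllib.parse import urlencode

# Assumption: a local Solr instance and a core called "mycore" (both hypothetical).
SOLR_SELECT = "http://localhost:8983/solr/mycore/select"


def proximity_debug_url(phrase: str, slop: int, field: str = "laudo") -> str:
    """Build a select URL for a proximity query with debug=query enabled."""
    params = {
        "q": f'{field}:"{phrase}"~{slop}',  # e.g. laudo:"mother grandmother"~8
        "debug": "query",                   # ask Solr to return the parsed query
        "wt": "json",
    }
    return f"{SOLR_SELECT}?{urlencode(params)}"


if __name__ == "__main__":
    # The query that fails and the analogous query that works.
    print(proximity_debug_url("mother grandmother", 8))
    print(proximity_debug_url("father grandfather", 8))
```

Opening both URLs and comparing the "parsedquery" entries in the responses should show whether the slop value survives parsing.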
I am not a Solr developer, so pardon me if I am wrong, but I ran the query with debug=query and saw that when a proximity search involves multi-term synonyms, it is parsed into a SpanNearQuery:
"parsedquery":"SpanNearQuery(spanNear([laudo:mother,
spanOr([laudo:grand mother, laudo:grandmother])],0, true))"
while proximity searches with one-term synonyms are executed with:
"MultiPhraseQuery(laudo:\"father (grandfather granddad)\"~10)"
Note that the SpanNearQuery is built with a slop parameter of 0, no matter what is passed after the tilde. So if I search for the exact phrase, it does match.
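The effect of a slop of 0 can be illustrated with a simplified model of ordered proximity matching. This is a sketch only, not Lucene's implementation: it ignores analysis entirely (stop words, stemming, synonyms) and treats the slop as the maximum number of token positions allowed between the two terms. The function name `ordered_near_match` is made up for this illustration.

```python
def ordered_near_match(tokens, first, second, slop):
    """Return True if `first` is followed by `second` with at most `slop`
    token positions in between (simplified model of an ordered span match)."""
    for i, tok in enumerate(tokens):
        if tok != first:
            continue
        for j in range(i + 1, len(tokens)):
            # j - i - 1 is the number of positions between the two terms.
            if tokens[j] == second and (j - i - 1) <= slop:
                return True
    return False


# Whitespace-split version of the example text from this report.
tokens = "my mother and my grandmother went".split()

print(ordered_near_match(tokens, "mother", "grandmother", 0))  # False
print(ordered_near_match(tokens, "mother", "grandmother", 8))  # True
```

Two tokens sit between "mother" and "grandmother", so under this model a slop of 0 cannot match the example text, while the requested slop of 8 would.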
Here is my field-type, just in case:
<fieldType name="text_pt_synonyms_ascii_minimal_lightStem" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" format="snowball" words="lang/stopwords_pt.txt" ignoreCase="true"/>
    <filter class="solr.PortugueseLightStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" format="snowball" words="lang/stopwords_pt.txt" ignoreCase="true"/>
    <filter class="solr.ASCIIFoldingFilterFactory" preserveOriginal="true"/>
    <filter class="solr.SynonymGraphFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms_radex.txt"/>
    <filter class="solr.PortugueseLightStemFilterFactory"/>
  </analyzer>
</fieldType>