Details
- Type: Improvement
- Status: Closed
- Priority: Major
- Resolution: Duplicate
Description
When handling synonyms at query time, Solr fails to work with multi-word synonyms, for two reasons:
- First, the Lucene query parser tokenizes the user query on whitespace, so it splits a multi-word term into separate terms before passing them to the synonym filter. The synonym filter therefore cannot recognize the multi-word term and expand it.
- Second, if the synonym filter expands a term into multiple alternatives that include a multi-word synonym, SolrQueryParserBase currently uses MultiPhraseQuery to handle the synonyms, but MultiPhraseQuery does not work when the alternatives have different numbers of words.
For the first problem, we can automatically quote every multi-word synonym in the user query so that the Lucene query parser does not split it. There is a related JIRA issue: https://issues.apache.org/jira/browse/LUCENE-2605.
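The quoting step could be sketched roughly as follows. This is a minimal illustration in plain Java, not code from Solr or Lucene: the class `MultiWordSynonymQuoter`, its `quote` method, and the idea of passing the multi-word synonyms as a simple list are all assumptions made for the example. It leaves already-quoted segments alone and prefers the longest synonym when several overlap.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Solr or Lucene.
final class MultiWordSynonymQuoter {

    // Wraps each known multi-word synonym found in the query in double quotes,
    // so a whitespace-splitting query parser keeps it together as one phrase.
    static String quote(String query, List<String> multiWordSynonyms) {
        if (multiWordSynonyms.isEmpty()) {
            return query;
        }
        // Longest synonym first, so "new york city" wins over "new york".
        List<String> sorted = new ArrayList<>(multiWordSynonyms);
        sorted.sort(Comparator.comparingInt(String::length).reversed());

        StringBuilder alternatives = new StringBuilder();
        for (String synonym : sorted) {
            StringBuilder one = new StringBuilder();
            for (String word : synonym.trim().split("\\s+")) {
                if (one.length() > 0) {
                    one.append("\\s+"); // tolerate any run of whitespace between words
                }
                one.append(Pattern.quote(word));
            }
            if (alternatives.length() > 0) {
                alternatives.append('|');
            }
            alternatives.append(one);
        }

        // First alternative consumes segments that are already quoted, so they are not re-quoted.
        Pattern pattern = Pattern.compile(
                "\"[^\"]*\"|\\b(?:" + alternatives + ")\\b",
                Pattern.CASE_INSENSITIVE);

        Matcher matcher = pattern.matcher(query);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String hit = matcher.group();
            String replacement = hit.startsWith("\"") ? hit : "\"" + hit + "\"";
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

In a real deployment the synonym list would presumably come from the same synonyms file the synonym filter reads; that wiring is left out here.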
For the second, when the token stream contains a multi-word synonym, we can replace the MultiPhraseQuery with a BooleanQuery whose SHOULD clauses are PhraseQuery instances, one per synonym alternative.
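To make the intended query shape concrete, here is a small sketch that renders the expansion as a query string instead of building Lucene Query objects. It is plain Java with no Lucene dependency; the class `SynonymPhraseDisjunction`, the method `toQueryString`, and the field name `text` used in the usage note are assumptions for illustration. Each alternative becomes its own quoted phrase, and the phrases are joined with OR, which the classic query syntax treats as optional (SHOULD) clauses, so alternatives with different word counts no longer need to line up position by position.

```java
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper for illustration only; not part of Solr or Lucene.
final class SynonymPhraseDisjunction {

    // Renders one quoted phrase per synonym alternative, joined by OR,
    // e.g. (text:"wifi" OR text:"wi fi").
    static String toQueryString(String field, List<String> alternatives) {
        if (alternatives.isEmpty()) {
            throw new IllegalArgumentException("at least one alternative is required");
        }
        StringJoiner clauses = new StringJoiner(" OR ", "(", ")");
        for (String alternative : alternatives) {
            // Escape backslashes and double quotes so the phrase stays well-formed.
            String phrase = alternative.trim()
                    .replace("\\", "\\\\")
                    .replace("\"", "\\\"");
            clauses.add(field + ":\"" + phrase + "\"");
        }
        return clauses.toString();
    }
}
```

For example, calling `toQueryString("text", ...)` with the alternatives `wifi`, `wi fi`, and `wireless network` yields `(text:"wifi" OR text:"wi fi" OR text:"wireless network")`.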
Attachments
Issue Links
- duplicates
  - SOLR-9185 Solr's edismax and "Lucene"/standard query parsers should optionally not split on whitespace before sending terms to analysis (Closed)
  - SOLR-4381 Query-time multi-word synonym expansion (Closed)
  - SOLR-10343 Update Solr default/example and test configs to use SynonymGraphFilterFactory (Closed)
- is related to
  - LUCENE-2605 queryparser parses on whitespace (Closed)
  - LUCENE-4499 Multi-word synonym filter (synonym expansion) (Resolved)
  - SOLR-4381 Query-time multi-word synonym expansion (Closed)
- relates to
  - SOLR-9185 Solr's edismax and "Lucene"/standard query parsers should optionally not split on whitespace before sending terms to analysis (Closed)
  - LUCENE-3130 Use BoostAttribute in in TokenFilters to denote Terms that QueryParser should give lower boosts (Open)