Details
Bug
Status: Resolved
Critical
Resolution: Not A Problem
4.4
None
None
Description
When I have a field using CJKBigramFilter, a mysterious qs value (or what I take to be qs, because it shows as ~x after the first DisjunctionMaxQuery) appears in my parsed query. The qs value that appears is the minimum of the mm setting and the number of bigrams in the query string.
This makes no sense from a retrieval standpoint. It could possibly make sense to adjust the ps value, but certainly not the qs. Moreover, changing the mm setting via an HTTP param can affect the qs, but sending in a qs parameter has no effect on the qs in the parsed query.
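One way to check this is to send the same query twice with debugQuery on, overriding mm in one request and qs in the other, and then compare the parsedquery output. A minimal sketch in Python that only builds the request URLs; the endpoint in `BASE_URL`, the helper name `debug_url`, and the override values are assumptions for illustration, not taken from this report:

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; adjust host, port, core, and handler to your setup.
BASE_URL = "http://localhost:8983/solr/select"


def debug_url(query, **overrides):
    """Build a request URL with debugQuery on plus any parameter overrides."""
    params = {"q": query, "debugQuery": "true"}
    params.update(overrides)
    # urlencode percent-encodes the CJK characters as UTF-8.
    return BASE_URL + "?" + urlencode(params)


query = "{!qf=cjk_bi_search pf= pf2= pf3=}旧小说"
mm_url = debug_url(query, mm="1")  # override mm only (illustrative value)
qs_url = debug_url(query, qs="5")  # override qs only (illustrative value)
```

Fetching both URLs and comparing the ~x value in each response would show which parameter actually drives it.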
If I use a field in qf that has only bigrams, then qs is set to MIN(original mm setting, number of bigrams in query string)
arg sent in: q=
{!qf=cjk_bi_search pf= pf2= pf3=}旧小说
旧小说 is 3 chars, so 2 bigrams
debugQuery
<str name="rawquerystring">{!qf=cjk_bi_search pf= pf2= pf3=}
旧小说</str>
<str name="querystring">
旧小说</str>
<str name="parsedquery">(+DisjunctionMaxQuery((((cjk_bi_search:旧小 cjk_bi_search:小说)~2))~0.01) ())/no_coord</str>
<str name="parsedquery_toString">+(((cjk_bi_search:旧小 cjk_bi_search:小说)~2))~0.01 ()</str>
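To compare this value across requests without reading the debug output by eye, the ~N suffix can be pulled out of parsedquery_toString. A minimal sketch in Python; the helper name `clause_group_suffix` is hypothetical, and the pattern assumes the integer suffix directly follows a closing parenthesis, as in the output above:

```python
import re

# Matches "~<integer>" right after ")" but skips decimals such as the "~0.01" tie value.
_SUFFIX = re.compile(r"\)~(\d+)(?![\d.])")


def clause_group_suffix(parsed_query):
    """Return the first integer ~N suffix in a parsed query string, or None."""
    match = _SUFFIX.search(parsed_query)
    return int(match.group(1)) if match else None


bigram_query = "+(((cjk_bi_search:旧小 cjk_bi_search:小说)~2))~0.01 ()"
suffix = clause_group_suffix(bigram_query)
```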
If I use a field in qf that has only unigrams, then qs is set to MIN(original mm setting, number of unigrams in query string)
arg sent in: q=
{!qf=cjk_uni_search pf= pf2= pf3=}旧小说
旧小说 is 3 chars, so 3 unigrams
debugQuery
<str name="rawquerystring">{!qf=cjk_uni_search pf= pf2= pf3=}
旧小说</str>
<str name="querystring">
旧小说</str>
<str name="parsedquery">(+DisjunctionMaxQuery((((cjk_uni_search:旧 cjk_uni_search:小 cjk_uni_search:说)~3))~0.01) ())/no_coord</str>
<str name="parsedquery_toString">+(((cjk_uni_search:旧 cjk_uni_search:小 cjk_uni_search:说)~3))~0.01 ()</str>
If I use a field in qf that has both bigrams and unigrams, then qs is set to MIN(original mm setting, number of bigrams + unigrams in query string).
arg sent in: q=
{!qf=cjk_both_search pf= pf2= pf3=}旧小说
旧小说 is 3 chars, so 3 unigrams + 2 bigrams = 5
debugQuery
<str name="rawquerystring">{!qf=cjk_both_search pf= pf2= pf3=}
旧小说</str>
<str name="querystring">
旧小说</str>
<str name="parsedquery">(+DisjunctionMaxQuery((((cjk_both_search:旧 cjk_both_search:旧小 cjk_both_search:小 cjk_both_search:小说 cjk_both_search:说)~5))~0.01) ())/no_coord</str>
<str name="parsedquery_toString">+(((cjk_both_search:旧 cjk_both_search:旧小 cjk_both_search:小 cjk_both_search:小说 cjk_both_search:说)~5))~0.01 ()</str>
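The token counts in the three cases above (2, 3, and 5) can be reproduced with a rough stand-in for the analysis chain. This is only a sketch in Python: the function name `cjk_tokens` is hypothetical, and it ignores everything the real filters do beyond emitting overlapping bigrams and single characters (script detection, folding, transliteration):

```python
def cjk_tokens(text, bigrams=True, unigrams=False):
    """Emit single characters and/or overlapping bigrams in position order."""
    tokens = []
    for i, ch in enumerate(text):
        if unigrams:
            tokens.append(ch)
        if bigrams and i + 1 < len(text):
            tokens.append(text[i:i + 2])
    return tokens


text = "旧小说"
bi_only = cjk_tokens(text, bigrams=True, unigrams=False)
uni_only = cjk_tokens(text, bigrams=False, unigrams=True)
both = cjk_tokens(text, bigrams=True, unigrams=True)
```

The lengths of the three lists match the ~2, ~3, and ~5 values in the parsed queries.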
I am running Solr 4.4. I have fields defined like so:
<fieldtype name="text_cjk_both" class="solr.TextField" positionIncrementGap="10000" autoGeneratePhraseQueries="false">
  <analyzer>
    <tokenizer class="solr.ICUTokenizerFactory" />
    <filter class="solr.CJKWidthFilterFactory"/>
    <filter class="solr.ICUTransformFilterFactory" id="Traditional-Simplified"/>
    <filter class="solr.ICUTransformFilterFactory" id="Katakana-Hiragana"/>
    <filter class="solr.ICUFoldingFilterFactory"/>
    <filter class="solr.CJKBigramFilterFactory" han="true" hiragana="true" katakana="true" hangul="true" outputUnigrams="true" />
  </analyzer>
</fieldtype>
<fieldtype name="text_cjk_bi" class="solr.TextField" positionIncrementGap="10000" autoGeneratePhraseQueries="false">
  <analyzer>
    <tokenizer class="solr.ICUTokenizerFactory" />
    <filter class="solr.CJKWidthFilterFactory"/>
    <filter class="solr.ICUTransformFilterFactory" id="Traditional-Simplified"/>
    <filter class="solr.ICUTransformFilterFactory" id="Katakana-Hiragana"/>
    <filter class="solr.ICUFoldingFilterFactory"/>
    <filter class="solr.CJKBigramFilterFactory" han="true" hiragana="true" katakana="true" hangul="true" outputUnigrams="false" />
  </analyzer>
</fieldtype>
<fieldtype name="text_cjk_uni" class="solr.TextField" positionIncrementGap="10000" autoGeneratePhraseQueries="false">
  <analyzer>
    <tokenizer class="solr.ICUTokenizerFactory" />
    <filter class="solr.CJKWidthFilterFactory"/>
    <filter class="solr.ICUTransformFilterFactory" id="Traditional-Simplified"/>
    <filter class="solr.ICUTransformFilterFactory" id="Katakana-Hiragana"/>
    <filter class="solr.ICUFoldingFilterFactory"/>
  </analyzer>
</fieldtype>
The request handler uses edismax:
<requestHandler name="search" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="q.alt">*:*</str>
    <str name="mm">6<-1 6<90%</str>
    <int name="qs">1</int>
    <int name="ps">0</int>
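For comparison, here is a sketch of how the conditional mm value in these defaults could be evaluated. It is written in Python and assumes the conditional syntax described in the Solr dismax documentation, where `N<spec` applies `spec` only when there are more than N optional clauses and all clauses are required otherwise. The function name `min_should_match` is hypothetical, and this is not a copy of Solr's implementation:

```python
def min_should_match(clause_count, spec):
    """Evaluate an mm spec such as '6<-1 6<90%' for a given number of optional clauses."""
    spec = spec.strip()
    if "<" in spec:
        result = clause_count  # default: every clause is required
        for part in spec.split():
            bound, rest = part.split("<", 1)
            if clause_count <= int(bound):
                return result
            result = min_should_match(clause_count, rest)
        return result
    if spec.endswith("%"):
        percent = int(spec[:-1])
        calc = abs(clause_count * percent) // 100  # truncate toward zero
        result = clause_count - calc if percent < 0 else calc
    else:
        value = int(spec)
        result = clause_count + value if value < 0 else value
    # Clamp to the valid range of 0..clause_count.
    return max(0, min(clause_count, result))


mm_spec = "6<-1 6<90%"
required = [min_should_match(n, mm_spec) for n in (2, 3, 5)]
```

Under that reading, any query with six or fewer optional clauses requires all of them, which would give 2, 3, and 5 for the three queries shown earlier.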
Issue Links
is part of SOLR-2368 Improve extended dismax (edismax) parser (Open)