[LUCENE-7533] Classic query parser: autoGeneratePhraseQueries=true doesn't work when splitOnWhitespace=false - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 6.2, 6.2.1, 6.3
Fix Version/s: 6.4, 7.0
Component/s: None
Labels: None

Lucene Fields: New

Description

~~LUCENE-2605~~ introduced the classic query parser option to not split on whitespace prior to performing analysis.

From the javadocs for QueryParser.setAutoGeneratePhraseQueries():

"phrase queries will be automatically generated when the analyzer returns more than one term from whitespace delimited text."

When splitOnWhitespace=false, the output from analysis can come from multiple whitespace-separated words rather than from a single whitespace-delimited chunk, which breaks the assumptions the code makes when autoGeneratePhraseQueries=true. For this combination of options, it is not appropriate to auto-quote multiple non-overlapping tokens produced by analysis. For example, simple whitespace tokenization over the query "some words" produces the token sequence ("some", "words"), and even when autoGeneratePhraseQueries=true, we should not create a phrase query here.
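To make the failure mode concrete, here is a toy model in plain Java. It is not Lucene code: `AutoPhraseSketch`, `analyze`, `buildClause`, and `parse` are hypothetical names, and the stand-in analyzer (split on whitespace and hyphens) is an assumption chosen only for illustration. It applies the naive rule "more than one token from analysis means phrase query" with and without splitting on whitespace first.

```java
import java.util.ArrayList;
import java.util.List;

public class AutoPhraseSketch {
    // Hypothetical stand-in for an analyzer: breaks text on whitespace and hyphens.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("[\\s-]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Naive auto-phrase rule: more than one token from one analyzed text => quote it.
    static String buildClause(String text) {
        List<String> tokens = analyze(text);
        String joined = String.join(" ", tokens);
        return tokens.size() > 1 ? "\"" + joined + "\"" : joined;
    }

    // splitOnWhitespace=true: analyze each whitespace-delimited chunk separately.
    // splitOnWhitespace=false: hand the whole query text to analysis at once.
    static String parse(String query, boolean splitOnWhitespace) {
        if (!splitOnWhitespace) {
            return buildClause(query);
        }
        List<String> clauses = new ArrayList<>();
        for (String chunk : query.trim().split("\\s+")) {
            clauses.add(buildClause(chunk));
        }
        return String.join(" ", clauses);
    }

    public static void main(String[] args) {
        System.out.println(parse("some words", true));    // some words
        System.out.println(parse("wi-fi router", true));  // "wi fi" router
        System.out.println(parse("some words", false));   // "some words"
    }
}
```

In this model the first two results match the javadoc's intent: only a single whitespace-delimited chunk that analysis breaks into several tokens is quoted. The third result is the problem described above: without pre-splitting, two independent words are turned into a phrase.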

Attachments

- LUCENE-7533.patch (21 kB), Steven Rowe, 02/Nov/16 00:08
- LUCENE-7533-disallow-option-combo.patch (10 kB), Steven Rowe, 18/Nov/16 00:16

Issue Links

is related to: SOLR-10348 unexpected sow=false interaction with defaultSearchField (Closed)

People

Assignee: Steven Rowe

Reporter: Steven Rowe

Votes: 0

Watchers: 3

Dates

Created: 01/Nov/16 23:38

Updated: 28/Aug/22 15:05

Resolved: 18/Nov/16 00:24