Issue Details (XML | Word | Printable)

Key: LUCENE-1134
Type: Bug Bug
Status: Resolved Resolved
Resolution: Fixed
Priority: Minor Minor
Assignee: Unassigned
Reporter: Shai Erera
Votes: 1
Watchers: 0
Operations

If you were logged in you would be able to see more operations.
Lucene - Java

BooleanQuery.rewrite does not work properly for minNumberShouldMatch

Created: 15/Jan/08 07:08 PM   Updated: 03/Feb/08 08:21 AM
Return to search
Component/s: Search
Affects Version/s: 2.4
Fix Version/s: None

Time Tracking:
Not Specified

File Attachments:
  Size
Text File Licensed for inclusion in ASF works LUCENE-1134.patch 2008-01-16 06:07 AM Shai Erera 1 kB

Lucene Fields: Patch Available, New
Resolution Date: 03/Feb/08 08:21 AM


 Description  « Hide
BooleanQuery.rewrite does not respect minNumberShouldMatch if the number of clauses is 1. This causes inconsistencies for the queries "+def" and "+abc +def", while setting the minNumShouldMatch to '1' for both.
For the first query, results are returned although there are no SHOULD clauses in the query.
For the second query no results are returned.
The reason lies in the optimization BooleanQuery.rewrite has for one clauses queries.
Patch included - optimize the query for a single clause only if the minNumShouldMatch <= 0.

 All   Comments   Work Log   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Shai Erera added a comment - 15/Jan/08 07:10 PM
The fix is very trivial. Which tests should I run to ensure this doesn't break anything?

Paul Elschot added a comment - 15/Jan/08 07:19 PM
I'd start with these tests:

ant -Dtestcase='Test*Bool*' test-core

You might also consider adding a test case in one of those tests that shows the bug without the patch applied.


Hoss Man added a comment - 15/Jan/08 07:34 PM
At first glance, the patch seems right .. but an optimization can probably be made ... if there is only once clause, and minNrShouldMatch is greater then 1, the query can rewrite itself to a "no-op" type query that doesn't match any docs.

Actually: BooleanQuery.rewrite can do something like that anytime minNrShouldMath is greater then the number of optional clauses.


Paul Elschot added a comment - 15/Jan/08 07:44 PM
There is no real need for this optimization because BooleanScorer2 already uses a NonMatchingScorer for this case.

Hoss Man added a comment - 15/Jan/08 08:06 PM
Ah... good call.

(Hmmm.... it stil might be worth doing just to make it clear to clients earlier rather then later that they won't match anything. ie:If you inspect a rewritten FuzzyQuery you get to see right away how many clauses it has ... why not let people inspecting a rewritten BooleanQuery in cases like this see right away that their query is a No-Op ?)


Paul Elschot added a comment - 15/Jan/08 09:42 PM
A FuzzyQuery rewrites to a BooleanQuery with SHOULD clauses only, so a check and cast to BooleanQuery and then a clauses().size() == 0 can check for a No-Op after a rewrite().

Hoss Man added a comment - 15/Jan/08 09:48 PM
I think you're misunderstanding me Paul: i was just trying to make an analogy – even if the Scorer optimizes away a BooleanQuery that can never match anything into a NonMatchingScorer, there are other advantages to having rewrite detect these situations (when it can) ... inspection of the rewritten queries being an example.

Shai Erera added a comment - 16/Jan/08 06:03 AM
Hi

W.r.t. testing, I found TestBooleanMinShouldMatch. It had testNoOptionalButMin which added two MUST clauses. I added testNoOptionalButMin2 that adds only a single clause. It fails without the patch and passes with the patch. Other than that the tests in this class pass.
I updated the patch to include the additional test.

W.r.t. the optimization that was discussed: I don't understand how it can be done. BooleanQuery has a clauses list. It is not aware of any SHOULD, MUST or MUST_NOT clauses in its rewrite() method. So how exactly this optimization can be done? The only class that seems to be aware of the type of clauses is BooleanScorer2 in its add() method.


Shai Erera added a comment - 16/Jan/08 06:07 AM
Fix + Test

Michael Busch added a comment - 03/Feb/08 08:21 AM
Committed. Thanks, Shai!