Lucene - Core: LUCENE-3412

SloppyPhraseScorer returns non-deterministic results for queries with many repeats

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.1, 3.2, 3.3, 4.0-ALPHA
    • Fix Version/s: 3.5, 4.0-ALPHA
    • Component/s: core/search
    • Labels:
      None

      Description

      Proximity queries with many repeats (four or more, based on my testing) return non-deterministic results. I run the same query multiple times with the same data set and get different results.

      So far I've reproduced this with Solr 1.4.1, 3.1, 3.2, 3.3, and latest 4.0 trunk.

      Steps to reproduce (using the Solr example):
      1) In solrconfig.xml, set queryResultCache size to 0.
      2) Add some documents with text "dog dog dog" and "dog dog dog dog". http://localhost:8983/solr/update?stream.body=%3Cadd%3E%3Cdoc%3E%3Cfield%20name=%22id%22%3E1%3C/field%3E%3Cfield%20name=%22text%22%3Edog%20dog%20dog%3C/field%3E%3C/doc%3E%3Cdoc%3E%3Cfield%20name=%22id%22%3E2%3C/field%3E%3Cfield%20name=%22text%22%3Edog%20dog%20dog%20dog%3C/field%3E%3C/doc%3E%3C/add%3E&commit=true
      3) Do a "dog dog dog dog"~1 query. http://localhost:8983/solr/select?q=%22dog%20dog%20dog%20dog%22~1
      4) Repeat step 3 many times.

      Expected results: Only the document with id 2 should be returned.

      Actual results: The document with id 2 is always returned. The document with id 1 is sometimes returned.

      Different proximity values show the same bug - "dog dog dog dog"~5, "dog dog dog dog"~100, etc.

      So far I've traced it down to the "repeats" array in SloppyPhraseScorer.initPhrasePositions() - depending on the order of the elements in this array, the document may or may not match. I think the HashSet may be to blame, but I'm not sure - that at least seems to be where the non-determinism is coming from.
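The suspicion about the HashSet is plausible: in Java, a class that does not override hashCode() falls back to identity hash codes, which differ from run to run, so HashSet iteration order over such objects is not stable across JVM runs. The sketch below is a hypothetical simplification (the PP class and its offset field are illustrative stand-ins, not the actual Lucene code) showing how sorting by a stable key restores a deterministic processing order:

```java
import java.util.*;

public class HashSetOrderDemo {
    // Hypothetical stand-in for a phrase-position object; like the real
    // one, it does not override hashCode(), so identity hash codes
    // (which vary between JVM runs) decide its placement in a HashSet.
    static class PP {
        final int offset;
        PP(int offset) { this.offset = offset; }
    }

    public static void main(String[] args) {
        Set<PP> repeats = new HashSet<>();
        for (int off : new int[] {3, 0, 2, 1}) {
            repeats.add(new PP(off));
        }

        // Iteration order here is unspecified and may differ across runs.
        System.out.print("HashSet order: ");
        for (PP pp : repeats) System.out.print(pp.offset + " ");
        System.out.println();

        // Sorting by offset yields the same order on every run,
        // regardless of how the HashSet happened to arrange its elements.
        List<PP> ordered = new ArrayList<>(repeats);
        ordered.sort(Comparator.comparingInt(pp -> pp.offset));
        for (int i = 0; i < ordered.size(); i++) {
            if (ordered.get(i).offset != i) throw new AssertionError("not in offset order");
        }
        System.out.println("sorted by offset: deterministic");
    }
}
```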

      1. LUCENE-3412.patch
        7 kB
        Doron Cohen
      2. LUCENE-3412.patch
        2 kB
        Doron Cohen

        Activity

        Robert Muir added a comment -

        This issue could also be related to LUCENE-3215: in some cases with repeats, sloppy phrasescorer returns scores of Infinity... what scores are you getting?

        However, I don't think it's a duplicate issue; with LUCENE-3215 the issue is when you have sloppyphrasequery + repeats + positionIncrements > 1 (e.g. stopwords and enablePositionIncrements=true, the default)

        Michael Ryan added a comment -

        Here's the debugQuery output from when it matched both docs:

        <lst name="explain"><str name="2">
        1.1890696 = (MATCH) weight(text:"dog dog dog dog"~1 in 1) [DefaultSimilarity], result of:
          1.1890696 = score(doc=1,freq=1.0 = phraseFreq=1.0
        ), product of:
            0.99999994 = queryWeight, product of:
              2.3781395 = idf(), sum of:
                0.5945349 = idf(docFreq=2, maxDocs=2)
                0.5945349 = idf(docFreq=2, maxDocs=2)
                0.5945349 = idf(docFreq=2, maxDocs=2)
                0.5945349 = idf(docFreq=2, maxDocs=2)
              0.42049676 = queryNorm
            1.1890697 = fieldWeight in 1, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = phraseFreq=1.0
              2.3781395 = idf(), sum of:
                0.5945349 = idf(docFreq=2, maxDocs=2)
                0.5945349 = idf(docFreq=2, maxDocs=2)
                0.5945349 = idf(docFreq=2, maxDocs=2)
                0.5945349 = idf(docFreq=2, maxDocs=2)
              0.5 = fieldNorm(doc=1)
        </str><str name="1">
        0.8407992 = (MATCH) weight(text:"dog dog dog dog"~1 in 0) [DefaultSimilarity], result of:
          0.8407992 = score(doc=0,freq=0.5 = phraseFreq=0.5
        ), product of:
            0.99999994 = queryWeight, product of:
              2.3781395 = idf(), sum of:
                0.5945349 = idf(docFreq=2, maxDocs=2)
                0.5945349 = idf(docFreq=2, maxDocs=2)
                0.5945349 = idf(docFreq=2, maxDocs=2)
                0.5945349 = idf(docFreq=2, maxDocs=2)
              0.42049676 = queryNorm
            0.8407993 = fieldWeight in 0, product of:
              0.70710677 = tf(freq=0.5), with freq of:
                0.5 = phraseFreq=0.5
              2.3781395 = idf(), sum of:
                0.5945349 = idf(docFreq=2, maxDocs=2)
                0.5945349 = idf(docFreq=2, maxDocs=2)
                0.5945349 = idf(docFreq=2, maxDocs=2)
                0.5945349 = idf(docFreq=2, maxDocs=2)
              0.5 = fieldNorm(doc=0)
        </str></lst>
        

        Sometimes when it matches both docs I'll get "no matching term" for the second one:

        <lst name="explain"><str name="2">
        1.1890696 = (MATCH) weight(text:"dog dog dog dog"~1 in 1) [DefaultSimilarity], result of:
          1.1890696 = score(doc=1,freq=1.0 = phraseFreq=1.0
        ), product of:
            0.99999994 = queryWeight, product of:
              2.3781395 = idf(), sum of:
                0.5945349 = idf(docFreq=2, maxDocs=2)
                0.5945349 = idf(docFreq=2, maxDocs=2)
                0.5945349 = idf(docFreq=2, maxDocs=2)
                0.5945349 = idf(docFreq=2, maxDocs=2)
              0.42049676 = queryNorm
            1.1890697 = fieldWeight in 1, product of:
              1.0 = tf(freq=1.0), with freq of:
                1.0 = phraseFreq=1.0
              2.3781395 = idf(), sum of:
                0.5945349 = idf(docFreq=2, maxDocs=2)
                0.5945349 = idf(docFreq=2, maxDocs=2)
                0.5945349 = idf(docFreq=2, maxDocs=2)
                0.5945349 = idf(docFreq=2, maxDocs=2)
              0.5 = fieldNorm(doc=1)
        </str><str name="1">
        0.0 = (NON-MATCH) no matching term
        </str></lst>
        
        Doron Cohen added a comment -

        I am able to see this inconsistent behavior!

        Attached patch contains a test that fails on this. The test currently prints the trial number. The first loop always passes in all 30 trials (expected), while the second loop always fails (for me) but is inconsistent about when it fails. Sometimes it fails on the first iteration; other times on the 3rd, 9th, etc.

        Quite peculiar... investigating...

        Doron Cohen added a comment -

        Attached patch with fix to this bug.

        The fix is rather simple: just process PPs in offset order. That is, when avoiding conflicts (a conflict means more than a single query PP landing on the same doc TP), make sure to handle PPs in a specific order: from first in the query to last in the query.

        This is crucial because the check for conflicts returns the PP with greater offset, and that one is advanced.
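The ordering requirement can be sketched as follows. This is a hypothetical simplification, not the actual Lucene code: the PP class, the findConflict method, and their fields are illustrative names only. The point is that when two PPs land on the same document position, the one with the greater query offset is the one returned for advancing, which only behaves consistently if PPs are examined first-to-last:

```java
import java.util.*;

public class ConflictOrderSketch {
    // Hypothetical, simplified stand-in for a phrase-position object.
    static class PP {
        final int offset;   // position of this term in the query
        final int position; // current term position in the document
        PP(int offset, int position) { this.offset = offset; this.position = position; }
    }

    // A "conflict" is two PPs landing on the same document position.
    // Returns the PP with the greater query offset (the one to advance),
    // or null if there is no conflict. The input MUST be in offset order
    // for the result to be deterministic.
    static PP findConflict(List<PP> pps) {
        Map<Integer, PP> seenAtPosition = new HashMap<>();
        for (PP pp : pps) {
            PP prev = seenAtPosition.putIfAbsent(pp.position, pp);
            if (prev != null) {
                return pp.offset > prev.offset ? pp : prev;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<PP> pps = new ArrayList<>(Arrays.asList(
            new PP(0, 5), new PP(1, 5), new PP(2, 6)));
        // Deterministic processing order: first in query to last in query.
        pps.sort(Comparator.comparingInt(pp -> pp.offset));
        PP toAdvance = findConflict(pps);
        if (toAdvance == null || toAdvance.offset != 1) {
            throw new AssertionError("expected the PP with the greater offset (1) to be advanced");
        }
        System.out.println("advance PP with offset " + toAdvance.offset);
    }
}
```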

        It was pretty quick to fix this, but it took longer to justify the fix.

        I added some explanations in the code so that next time justification would be faster and also renamed termPositionsDiffer() to termPositionsConflict() which more accurately describes the logic of that method.

        Now I need to see whether this fix is also related to LUCENE-3215.

        Michael Ryan added a comment -

        Thanks, Doron. I've tried applying your patch to Solr 3.2 and it is working well so far.

        Doron Cohen added a comment -

        Thanks Michael for verifying this, I'll go ahead and commit.

        Doron Cohen added a comment -

        Fix committed:

        • r1166541 - trunk
        • r1166563 - 3x

        (fix not included in 3.4 RC, therefore marked as 3.5 above)

        Uwe Schindler added a comment -

        Bulk close after release of 3.5


          People

          • Assignee:
            Doron Cohen
          • Reporter:
            Michael Ryan