Details
- Type: Bug
- Status: Open
- Priority: Minor
- Resolution: Unresolved
- Lucene Fields: New
Description
Spinoff of the user list discussion "SpanNearQuery - inOrder parameter".
With 3 documents:
- "a b x c d"
- "a b b d"
- "a b x b y d"
Here are a few queries (the number in parentheses is the expected number of hits).
These work as expected:
- (1) in-order, slop=0, "b", "x", "b"
- (1) in-order, slop=0, "b", "b"
- (2) in-order, slop=1, "b", "b"
These return more hits than expected:
- (1) any-order, slop=0, "b", "x", "b"
- (1) any-order, slop=1, "b", "x", "b"
- (1) any-order, slop=2, "b", "x", "b"
- (1) any-order, slop=3, "b", "x", "b"
These also return more hits than expected:
- (1) any-order, slop=0, "b", "b"
- (2) any-order, slop=1, "b", "b"
Each of the above passes when the equivalent phrase query is used instead (with the same slop; a phrase query has no in-order flag).
This seems related to a known issue with overlapping spans ("non-overlapping Span queries"), as Hoss pointed out, so we might decide to close this bug after all. I would still like to at least have the JUnit test that exposes the behavior recorded in JIRA.
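For reference, the expected hit counts listed above can be reproduced with a small brute-force model. This is only a sketch of the expected semantics, not Lucene's span implementation and not the attached test. It assumes that every clause must be matched at its own distinct position and that slop is the number of positions inside the matched span that no clause covers. The class and method names (ExpectedSpanHits, matches, countHits) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reference model of the EXPECTED hit counts; it does not use Lucene.
class ExpectedSpanHits {

    // The three documents from this issue, whitespace-tokenized.
    static final String[] DOCS = { "a b x c d", "a b b d", "a b x b y d" };

    /** True if every clause can be placed on its own position of doc within the slop. */
    public static boolean matches(String doc, String[] clauses, int slop, boolean inOrder) {
        return place(doc.split(" "), clauses, slop, inOrder, new ArrayList<>());
    }

    // Backtracking search: choose one position per clause, in clause order.
    private static boolean place(String[] tokens, String[] clauses, int slop,
                                 boolean inOrder, List<Integer> chosen) {
        if (chosen.size() == clauses.length) {
            int min = Integer.MAX_VALUE;
            int max = Integer.MIN_VALUE;
            for (int p : chosen) {
                min = Math.min(min, p);
                max = Math.max(max, p);
            }
            // Assumption: slop = positions inside the span not covered by any clause.
            return (max - min + 1) - clauses.length <= slop;
        }
        String term = clauses[chosen.size()];
        for (int pos = 0; pos < tokens.length; pos++) {
            if (!tokens[pos].equals(term)) continue;
            // Assumption: a position may not be reused by a second clause.
            if (chosen.contains(pos)) continue;
            // In-order: each clause must start after the previous one.
            if (inOrder && !chosen.isEmpty() && pos <= chosen.get(chosen.size() - 1)) continue;
            chosen.add(pos);
            boolean ok = place(tokens, clauses, slop, inOrder, chosen);
            chosen.remove(chosen.size() - 1); // undo the choice (removes by index)
            if (ok) return true;
        }
        return false;
    }

    /** Number of the three documents that match the query under this model. */
    public static int countHits(String[] clauses, int slop, boolean inOrder) {
        int hits = 0;
        for (String doc : DOCS) {
            if (matches(doc, clauses, slop, inOrder)) hits++;
        }
        return hits;
    }

    public static void main(String[] args) {
        String[] bxb = { "b", "x", "b" };
        String[] bb = { "b", "b" };
        System.out.println("any-order, slop=0, b x b: " + countHits(bxb, 0, false));
        System.out.println("any-order, slop=0, b b:   " + countHits(bb, 0, false));
        System.out.println("any-order, slop=1, b b:   " + countHits(bb, 1, false));
    }
}
```

Under these assumptions the document "a b x c d" can never match "b", "x", "b", because it contains only one "b". A query that returns it as a hit is reusing the same "b" position for both clauses.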
Attachments
Issue Links
- is duplicated by LUCENE-5932 SpanNearUnordered duplicate term counts itself as a match (Resolved)