Lucene - Core
  1. Lucene - Core
  2. LUCENE-3120

span query matches too many docs when two query terms are the same unless inOrder=true

    Details

    • Type: Bug Bug
    • Status: Open
    • Priority: Minor Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 4.9, Trunk
    • Component/s: core/search
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      spinoff of user list discussion - SpanNearQuery - inOrder parameter.

      With 3 documents:

      • "a b x c d"
      • "a b b d"
      • "a b x b y d"

      Here are a few queries (the number in parenthesis indicates expected #hits):

      These ones work as expected:

      • (1) in-order, slop=0, "b", "x", "b"
      • (1) in-order, slop=0, "b", "b"
      • (2) in-order, slop=1, "b", "b"

      These ones match too many hits:

      • (1) any-order, slop=0, "b", "x", "b"
      • (1) any-order, slop=1, "b", "x", "b"
      • (1) any-order, slop=2, "b", "x", "b"
      • (1) any-order, slop=3, "b", "x", "b"

      These ones match too many hits as well:

      • (1) any-order, slop=0, "b", "b"
      • (2) any-order, slop=1, "b", "b"

      Each of the above passes when using a phrase query (applying the slop, no in-order indication in phrase query).

      This seems related to a known overlapping spans issue - non-overlapping Span queries - as indicated by Hoss, so we might decide to close this bug after all, but I would like to at least have the junit that exposes the behavior in JIRA.

      1. LUCENE-3120.patch
        4 kB
        Doron Cohen
      2. LUCENE-3120.patch
        4 kB
        Doron Cohen
      3. LUCENE-3120.patch
        0.9 kB
        Steve Davids

        Issue Links

          Activity

            People

            • Assignee:
              Unassigned
              Reporter:
              Doron Cohen
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:

                Development