Uploaded image for project: 'Lucene - Core'
  1. Lucene - Core
  2. LUCENE-3120

span query matches too many docs when two query terms are the same unless inOrder=true

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 4.9, 6.0
    • Component/s: core/search
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      spinoff of user list discussion - SpanNearQuery - inOrder parameter.

      With 3 documents:

      • "a b x c d"
      • "a b b d"
      • "a b x b y d"

      Here are a few queries (the number in parenthesis indicates expected #hits):

      These ones work as expected:

      • (1) in-order, slop=0, "b", "x", "b"
      • (1) in-order, slop=0, "b", "b"
      • (2) in-order, slop=1, "b", "b"

      These ones match too many hits:

      • (1) any-order, slop=0, "b", "x", "b"
      • (1) any-order, slop=1, "b", "x", "b"
      • (1) any-order, slop=2, "b", "x", "b"
      • (1) any-order, slop=3, "b", "x", "b"

      These ones match too many hits as well:

      • (1) any-order, slop=0, "b", "b"
      • (2) any-order, slop=1, "b", "b"

      Each of the above passes when using a phrase query (applying the slop, no in-order indication in phrase query).

      This seems related to a known overlapping spans issue - non-overlapping Span queries - as indicated by Hoss, so we might decide to close this bug after all, but I would like to at least have the junit that exposes the behavior in JIRA.

        Attachments

        1. LUCENE-3120.patch
          0.9 kB
          Steve Davids
        2. LUCENE-3120.patch
          4 kB
          Doron Cohen
        3. LUCENE-3120.patch
          4 kB
          Doron Cohen

          Issue Links

            Activity

              People

              • Assignee:
                Unassigned
                Reporter:
                doronc Doron Cohen
              • Votes:
                0 Vote for this issue
                Watchers:
                3 Start watching this issue

                Dates

                • Created:
                  Updated: