Lucene - Core / LUCENE-3758

Allow the ComplexPhraseQueryParser to search ordered or unordered proximity queries.

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 4.0-ALPHA
    • Fix Version/s: 4.8, 5.0
    • Component/s: core/queryparser
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      The ComplexPhraseQueryParser uses SpanNearQuery, but always sets the "inOrder" value to a hardcoded "true". This could be made configurable.
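To make the request concrete, here is a minimal, self-contained sketch of what a configurable "inOrder" flag changes. It does not use Lucene: the class `ToyPhraseMatcher`, its `setInOrder` setter, and the gap-based notion of slop (number of tokens between the two terms) are assumptions made for illustration, and they only approximate how span-near matching behaves.

```java
import java.util.Arrays;
import java.util.List;

// Toy model of a two-term proximity match; NOT the Lucene implementation.
class ToyPhraseMatcher {
    // Defaults to true so that behavior is unchanged unless the caller opts out.
    private boolean inOrder = true;

    void setInOrder(boolean inOrder) {
        this.inOrder = inOrder;
    }

    /** True if `first` and `second` occur with at most `slop` tokens between them. */
    boolean near(List<String> tokens, String first, String second, int slop) {
        for (int i = 0; i < tokens.size(); i++) {
            if (!tokens.get(i).equals(first)) continue;
            for (int j = 0; j < tokens.size(); j++) {
                if (j == i || !tokens.get(j).equals(second)) continue;
                // With inOrder set, `second` must come after `first`.
                if (inOrder && j < i) continue;
                int gap = Math.abs(j - i) - 1; // tokens strictly between the two terms
                if (gap <= slop) return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> doc = Arrays.asList("bar", "baz", "foo");
        ToyPhraseMatcher matcher = new ToyPhraseMatcher();
        System.out.println(matcher.near(doc, "foo", "bar", 2)); // ordered: "bar" precedes "foo"
        matcher.setInOrder(false);
        System.out.println(matcher.near(doc, "foo", "bar", 2)); // unordered
    }
}
```

In this toy model the ordered match fails for the document "bar baz foo" and the unordered one succeeds, which is the distinction a hardcoded "true" hides from callers.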

      1. LUCENE-3758.patch
        5 kB
        Erick Erickson
      2. LUCENE-3758.patch
        5 kB
        Ahmet Arslan
      3. LUCENE-3758.patch
        5 kB
        Ahmet Arslan

          Activity

          Dmitry Kan added a comment -

          Erick Erickson: right, agreed. This should be handled in another JIRA issue as a local param. We implemented it as an operator because we allow mixing ordered and unordered clauses in the same query.

          Erick Erickson added a comment -

          Thanks Ahmet!

          Erick Erickson added a comment -

          Fixed:
          trunk: r - 1578113
          4x: r - 1578134

          Erick Erickson added a comment -

          Ahmet's patch plus entry in CHANGES.txt

          Erick Erickson added a comment -

          Just to make sure I understand Dmitry's comment about the # operator: I don't see anything in this patch, on a quick look, that references a new operator, so that's a separate issue, correct? I see in the related SOLR-1604 patch the ability to specify inOrder="true|false" as a local parameter, so this functionality is available at that level.

          Frankly, I'd rather not introduce a new operator at this stage. Let's get the underlying functionality in place and treat any new operators as a separate issue, if we add one at all.

          Any responses to the comment by Robert Muir? My quick response is that I've seen use cases like this:
          "Find all the variants of 'john anderson', including 'jonathan anderson' and 'jon ivan gregory anderson', but not 'eric anderson and jonathan jones'." Contrived a bit, but you get the idea. Specifying slop alone doesn't allow this case, but slop with a specified order does.

          I'm going to commit this, along with SOLR-1604, today unless there are objections. The patch doesn't change current behavior, so it seems pretty safe.
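The name-variant use case above can be sketched without Lucene. The example below is only an illustration under stated assumptions: the prefix "jo" stands in for a wildcard term, slop is modeled as the number of tokens between the two terms, and `NameVariantSketch` and `matches` are hypothetical names, not part of any patch on this issue.

```java
// Toy illustration of "slop with specified order"; NOT the Lucene implementation.
class NameVariantSketch {
    /**
     * True if a token starting with `firstPrefix` and the token `second`
     * occur with at most `slop` tokens between them. When `inOrder` is set,
     * `second` must come after the prefix match.
     */
    static boolean matches(String text, String firstPrefix, String second,
                           int slop, boolean inOrder) {
        String[] tokens = text.toLowerCase().split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            if (!tokens[i].startsWith(firstPrefix)) continue;
            for (int j = 0; j < tokens.length; j++) {
                if (j == i || !tokens[j].equals(second)) continue;
                if (inOrder && j < i) continue;
                if (Math.abs(j - i) - 1 <= slop) return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] names = {
            "jonathan anderson",
            "jon ivan gregory anderson",
            "eric anderson and jonathan jones"
        };
        for (String name : names) {
            System.out.println(name
                + " | ordered=" + matches(name, "jo", "anderson", 3, true)
                + " | unordered=" + matches(name, "jo", "anderson", 3, false));
        }
    }
}
```

With a slop of 3, the ordered variant accepts the first two names and rejects the third, while the unordered variant accepts all three, because "anderson" sits within three tokens of "jonathan" regardless of direction.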

          Ahmet Arslan added a comment -

          patch for trunk (revision 1577942)

          Steve Rowe added a comment -

          Bulk move 4.4 issues to 4.5 and 5.0

          Tomás Fernández Löbbe added a comment -

          I think that makes sense. The query is different so it should have a different syntax.

          Dmitry Kan added a comment - edited

          That's correct, Tomás. We have already tested this operator internally and it works just fine.

          Tomás Fernández Löbbe added a comment -

          Basically, a search like '"foo bar"#2' would match documents with the terms "foo" and "bar" with up to 2 positions of distance from each other and only if "foo" is before "bar"?
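As a rough sketch of the syntax being discussed, the snippet below parses a quoted phrase followed by either "~N" or "#N" into terms, slop, and an inOrder flag. The "#" operator is the proposal from this thread and is not part of the patch on this issue; treating "~" as unordered follows the understanding voiced in the comments. The class `ProximityOperatorSketch`, the `Parsed` holder, and the regular expression are all hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for "phrase"~N (unordered) and "phrase"#N (ordered).
class ProximityOperatorSketch {
    private static final Pattern QUERY =
        Pattern.compile("^\"([^\"]+)\"([~#])(\\d+)$");

    static final class Parsed {
        final String[] terms;
        final int slop;
        final boolean inOrder;

        Parsed(String[] terms, int slop, boolean inOrder) {
            this.terms = terms;
            this.slop = slop;
            this.inOrder = inOrder;
        }
    }

    static Parsed parse(String query) {
        Matcher m = QUERY.matcher(query.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("unsupported query: " + query);
        }
        String[] terms = m.group(1).trim().split("\\s+");
        int slop = Integer.parseInt(m.group(3));
        boolean inOrder = m.group(2).equals("#"); // "#" requests ordered matching
        return new Parsed(terms, slop, inOrder);
    }

    public static void main(String[] args) {
        Parsed ordered = parse("\"foo bar\"#2");
        Parsed unordered = parse("\"foo bar\"~2");
        System.out.println(ordered.slop + " " + ordered.inOrder);
        System.out.println(unordered.slop + " " + unordered.inOrder);
    }
}
```

Keeping the two operators distinct at parse time is what would allow ordered and unordered clauses to be mixed in the same query.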

          Dmitry Kan added a comment -

          I have implemented a new query operator, "#", which does what is described in my previous message. Let me know if anyone needs a patch for this.

          Dmitry Kan added a comment -

          Hello!

          If I were to implement two different versions of span near queries, one ordered and one unordered, would this class be the right place to start?
          I'm thinking of adding support for a new query operator that would enforce the order of terms in the near query since, if I understand correctly, the "~" operator doesn't preserve the order.

          Robert Muir added a comment -

          OK guys, try to bear with me, since I don't know this thing at all:

          Would this apply to both exact and sloppy phrase queries?

          It seems to me that instead of hardcoding inOrder to true, we should only set inOrder=true if it's an exact phrase query.
          But if the user (not this qp itself; the user actually used ~) supplied slop, then inOrder should be false.

          This would better emulate the behavior of the regular Lucene queryparser... I'm wondering if we even need an option,
          since it just seems like the way it should work.
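The alternative proposed in this comment, deriving inOrder from whether the user typed "~" instead of exposing an option, could look roughly like the sketch below. It is only an illustration of the suggestion; `SlopDrivenOrderSketch` and `inOrderFor` are hypothetical names, and the string check is a stand-in for whatever the real parser would know about the query.

```java
// Sketch of the suggested rule: exact phrase => ordered, user-supplied slop => unordered.
class SlopDrivenOrderSketch {
    /** Returns true for an exact phrase ("a b"), false when the user appended ~N. */
    static boolean inOrderFor(String phraseQuery) {
        String q = phraseQuery.trim();
        int closingQuote = q.lastIndexOf('"');
        // A '~' after the closing quote means the user asked for slop explicitly.
        boolean userSuppliedSlop =
            closingQuote >= 0 && q.indexOf('~', closingQuote) > closingQuote;
        return !userSuppliedSlop;
    }

    public static void main(String[] args) {
        System.out.println(inOrderFor("\"foo bar\""));
        System.out.println(inOrderFor("\"foo bar\"~2"));
    }
}
```

Under this rule there is no way to ask for a sloppy match that still respects term order, which is the case raised elsewhere in the thread as a reason to keep order configurable.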

          Ahmet Arslan added a comment -

          patch for trunk


            People

            • Assignee:
              Erick Erickson
              Reporter:
              Tomás Fernández Löbbe
            • Votes:
              2
              Watchers:
              6
