Solr / SOLR-7466

Allow optional leading wildcards in complexphrase

      Description

      Currently, ComplexPhraseQParser (SOLR-1604) allows trailing wildcards on terms in a phrase but does not allow leading wildcards. I would like an option to search for terms with both trailing and leading wildcards.

      For example with:

      {!complexphrase allowLeadingWildcard=true}

      "j* *th"
      would match "John Smith" and "Jim Smith", but not "John Schmitt".
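
      The intended matching behaviour can be sketched in plain Java. This is only an illustration of the semantics described above, not how ComplexPhraseQParser is implemented: the class and method names are made up for the example, tokens are split on whitespace, and matching is case-insensitive by assumption.

```java
import java.util.regex.Pattern;

public class WildcardPhraseSketch {

    // Translate a term containing '*' and '?' wildcards into a regex.
    // Every other character is quoted so it matches literally.
    static Pattern toPattern(String wildcardTerm) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcardTerm.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.CASE_INSENSITIVE);
    }

    // True if the phrase terms match consecutive whitespace-separated tokens of the text.
    static boolean matchesPhrase(String phrase, String text) {
        String[] terms = phrase.trim().split("\\s+");
        String[] tokens = text.trim().split("\\s+");
        for (int start = 0; start + terms.length <= tokens.length; start++) {
            boolean all = true;
            for (int i = 0; i < terms.length && all; i++) {
                // matches() requires the whole token to match the term pattern
                all = toPattern(terms[i]).matcher(tokens[start + i]).matches();
            }
            if (all) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] names = {"John Smith", "Jim Smith", "John Schmitt"};
        for (String name : names) {
            System.out.println(name + " -> " + matchesPhrase("j* *th", name));
        }
    }
}
```

      With the phrase "j* *th", the sketch accepts "John Smith" and "Jim Smith" and rejects "John Schmitt", because "Schmitt" does not end in "th".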

      1. SOLR-7466.patch
        12 kB
        Mikhail Khludnev
      2. SOLR-7466.patch
        5 kB
        Mikhail Khludnev

          Activity

          fenra Andy hardin added a comment -

          I am currently working on a plugin for our own use, which will be a version of ComplexPhraseQParser that calls lparser.setAllowLeadingWildcard(true). I'd still like to see the option shown in my example added to ComplexPhraseQParser itself. I also have a patch in the works, but I want to make sure it has good test coverage first.

          martinleopold Martin Leopold added a comment -

          Hi,
          Any news on this issue? Andy: if you need some testers for your patch, I'd love to give it a spin.

          Br,
          Martin

          amundsen Jon Kjær Amundsen added a comment -

          Hi Andy
          I'm from Denmark, where we excel in compound words.
          Therefore your plugin could really be of use to me. If it's ready for testing, let me know.

          mkhludnev Mikhail Khludnev added a comment -

          I tried to hook up ReversedWildcardFilterFactory (SOLR-1321) in {!complexphrase}. The problem is that

           ComplexPhraseQueryParser --|> o.a.lucene.q.c.QueryParser
          

          But the nice code that leverages ReversedWildcardFilterFactory resides in o.a.solr...SolrQueryParserBase.getWildcardQuery(String, String).
          So far, I have copy-pasted CPQP to make it a descendant of Solr's parser. It works, but it's not pretty. Is there a better idea? Is it really necessary to deal with RWFF?
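
          For readers unfamiliar with the reversed-wildcard idea, here is a minimal sketch of why it helps. It is not Solr's code: the class is hypothetical, a sorted set stands in for the term dictionary, and the marker character is an assumption made for the illustration. The point is that if reversed terms are indexed next to the originals, a leading wildcard such as *th becomes a cheap prefix lookup on the reversed form instead of a scan over every term.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class ReversedTermsSketch {

    // Assumed marker prefixed to reversed terms so they cannot collide with normal terms.
    static final char MARKER = '\u0001';

    // Sorted set standing in for a term dictionary (an assumption of this sketch).
    private final TreeSet<String> terms = new TreeSet<>();

    // Index each term twice: as-is, and reversed behind the marker.
    void add(String term) {
        terms.add(term);
        terms.add(MARKER + new StringBuilder(term).reverse().toString());
    }

    // Resolve a pattern with a single leading '*' (e.g. "*th") by a prefix range scan
    // over the reversed terms, then map the hits back to their original form.
    List<String> leadingWildcard(String pattern) {
        if (!pattern.startsWith("*") || pattern.indexOf('*', 1) >= 0) {
            throw new IllegalArgumentException("expected a single leading '*': " + pattern);
        }
        String suffix = pattern.substring(1);
        String prefix = MARKER + new StringBuilder(suffix).reverse().toString();
        List<String> matches = new ArrayList<>();
        // tailSet starts at the first term >= prefix; stop once the prefix no longer matches.
        for (String t : terms.tailSet(prefix, true)) {
            if (!t.startsWith(prefix)) {
                break;
            }
            matches.add(new StringBuilder(t.substring(1)).reverse().toString());
        }
        return matches;
    }

    public static void main(String[] args) {
        ReversedTermsSketch index = new ReversedTermsSketch();
        for (String t : new String[] {"john", "jim", "smith", "schmitt", "booth"}) {
            index.add(t);
        }
        System.out.println(index.leadingWildcard("*th"));
    }
}
```

          Here *th only touches the reversed terms that start with "ht" and returns smith and booth; "schmitt" is never visited.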

          mkhludnev Mikhail Khludnev added a comment -

          Or how about moving ReversedWildcardFilterFactory into Lucene and letting Lucene's query parser detect it in the analysis chain, as is done in o.a.solr.p.SolrQueryParserBase.getReversedWildcardFilterFactory(FieldType)? Isn't there anything in Lucene that is on par with Solr's one?

          mkhludnev Mikhail Khludnev added a comment -

          What do you think about SOLR-7466.patch? We can add SolrQP as a mixin to Lucene's ComplexPhraseQP and delegate wildcards to the former!
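
          The delegation idea can be sketched roughly as follows. This is not the patch: all class and method names below are hypothetical stand-ins, and queries are represented as plain strings to keep the example self-contained. The phrase parser keeps its own phrase handling but forwards each wildcard term to a second, reverse-aware parser.

```java
public class DelegatingParserSketch {

    // Hypothetical stand-in for a generic phrase parser with an overridable wildcard hook.
    static class PhraseParser {
        protected String wildcardQuery(String field, String term) {
            if (term.startsWith("*")) {
                throw new IllegalArgumentException("leading wildcard not allowed: " + term);
            }
            return "wildcard(" + field + ":" + term + ")";
        }

        // Builds a textual description of the phrase; wildcard terms go through the hook.
        String parse(String field, String phrase) {
            StringBuilder sb = new StringBuilder("phrase[");
            String[] terms = phrase.trim().split("\\s+");
            for (int i = 0; i < terms.length; i++) {
                if (i > 0) {
                    sb.append(", ");
                }
                sb.append(terms[i].contains("*")
                        ? wildcardQuery(field, terms[i])
                        : "term(" + field + ":" + terms[i] + ")");
            }
            return sb.append("]").toString();
        }
    }

    // Hypothetical stand-in for a parser that knows the field indexes reversed terms.
    static class ReverseAwareParser {
        String wildcardQuery(String field, String term) {
            if (term.startsWith("*")) {
                String reversed = new StringBuilder(term.substring(1)).reverse().toString();
                return "reversedPrefix(" + field + ":" + reversed + "*)";
            }
            return "wildcard(" + field + ":" + term + ")";
        }
    }

    // The "mixin": a phrase parser whose wildcard hook forwards to the delegate.
    static PhraseParser withDelegate(ReverseAwareParser delegate) {
        return new PhraseParser() {
            @Override
            protected String wildcardQuery(String field, String term) {
                return delegate.wildcardQuery(field, term);
            }
        };
    }

    public static void main(String[] args) {
        PhraseParser parser = withDelegate(new ReverseAwareParser());
        System.out.println(parser.parse("name", "j* *th"));
    }
}
```

          Compared with copy-pasting the parser into a new class hierarchy, only the wildcard hook changes, so the phrase logic stays in one place.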

          mkhludnev Mikhail Khludnev added a comment -

          Is there any veto to make leading wildcards always on in complexphrase?

          erickerickson Erick Erickson added a comment -

          Mikhail:

          I haven't looked at the code, so take this with a large grain of salt.

          My only concern is whether having this on by default would cause a full terms scan. If always-on means a user can use a leading wildcard without specifying ReversedWildcardFilterFactory in the analysis chain, then it seems trappy.

          That said, I'll defer to your familiarity with the, you know, actual code.

          yseeley@gmail.com Yonik Seeley added a comment -

          "Is there any veto to make leading wildcards always on in complexphrase?"

          Seems fine to me. This is limited to the complexphrase qparser to begin with, and what makes sense for a default is to return what is requested without second-guessing.

          mkhludnev Mikhail Khludnev added a comment -

          Attaching SOLR-7466.patch, which also covers the SOLR-9900 case.

          jira-bot ASF subversion and git services added a comment -

          Commit d3f83bb948fd44e66099ef9537363ecef5bdb0f3 in lucene-solr's branch refs/heads/master from Mikhail Khludnev
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=d3f83bb ]

          SOLR-7466: reverse-aware leading wildcards in complexphrase query parser

          jira-bot ASF subversion and git services added a comment -

          Commit e5d063bdbbabec47dc5f53db4a2b35bbd20b0699 in lucene-solr's branch refs/heads/branch_6x from Mikhail Khludnev
          [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=e5d063b ]

          SOLR-7466: reverse-aware leading wildcards in complexphrase query parser


            People

            • Assignee:
              mkhludnev Mikhail Khludnev
              Reporter:
              fenra Andy hardin
            • Votes:
              3
              Watchers:
              10
