Solr
  1. Solr
  2. SOLR-7143

MoreLikeThis Query Parser does not handle multiple field names

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 5.0
    • Fix Version/s: 5.3
    • Component/s: query parsers
    • Labels:
      None

      Description

      The newly introduced MoreLikeThis Query Parser (SOLR-6248) does not return any results when supplied with multiple fields in the qf parameter.

      To reproduce within the techproducts example, compare:

      curl 'http://localhost:8983/solr/techproducts/select?q=%7B!mlt+qf=name%7DMA147LL/A'
      curl 'http://localhost:8983/solr/techproducts/select?q=%7B!mlt+qf=features%7DMA147LL/A'
      curl 'http://localhost:8983/solr/techproducts/select?q=%7B!mlt+qf=name,features%7DMA147LL/A'
      

      The first two queries return 8 and 5 results, respectively. The third query doesn't return any results (not even the matched document).

      In contrast, the MoreLikeThis Handler works as expected (accounting for the default mintf and mindf values in SimpleMLTQParser):

      curl 'http://localhost:8983/solr/techproducts/mlt?q=id:MA147LL/A&mlt.fl=name&mlt.mintf=1&mlt.mindf=1'
      curl 'http://localhost:8983/solr/techproducts/mlt?q=id:MA147LL/A&mlt.fl=features&mlt.mintf=1&mlt.mindf=1'
      curl 'http://localhost:8983/solr/techproducts/mlt?q=id:MA147LL/A&mlt.fl=name,features&mlt.mintf=1&mlt.mindf=1'
      

      After adding the following line to example/techproducts/solr/techproducts/conf/solrconfig.xml:

      <requestHandler name="/mlt" class="solr.MoreLikeThisHandler" />
      

      The first two queries return 7 and 4 results, respectively (excluding the matched document). The third query returns 7 results, as one would expect.

      1. SOLR-7143.patch
        17 kB
        Anshum Gupta
      2. SOLR-7143.patch
        11 kB
        Anshum Gupta
      3. SOLR-7143.patch
        5 kB
        Jens Wille
      4. SOLR-7143.patch
        10 kB
        Vitaliy Zhovtyuk
      5. SOLR-7143.patch
        5 kB
        Vitaliy Zhovtyuk

        Issue Links

          Activity

          Hide
          Vitaliy Zhovtyuk added a comment -

          Local parameters are not support multiple values syntax like:

          {!mlt qf=field1 qf=field2}

          , but qf list is required in MoreLikeThis.
          Added support for comma separated fields:

          {!mlt qf=field1,field2}

          Also comparing MLT handler query parser does not have any boost support on fields. This can be extended in qf parameter syntax.

          Show
          Vitaliy Zhovtyuk added a comment - Local parameters are not support multiple values syntax like: {!mlt qf=field1 qf=field2} , but qf list is required in MoreLikeThis. Added support for comma separated fields: {!mlt qf=field1,field2} Also comparing MLT handler query parser does not have any boost support on fields. This can be extended in qf parameter syntax.
          Hide
          Anshum Gupta added a comment -

          You've also added support for

          {!mlt qf=field1 qf=field2}

          Let's not do this specifically for this issue but have wider support for multi valued local params.

          Also, parseLocalParamsSplitted returns a new String array but it should instead just return null like getParam(paramName). Returning a non-null value always ensures that the code flow never hits:

          String[] qf = parseLocalParamsSplitted("qf"); // Never returns null
          
          if (qf != null) { }
          else { // Never gets here }
          
          Show
          Anshum Gupta added a comment - You've also added support for {!mlt qf=field1 qf=field2} Let's not do this specifically for this issue but have wider support for multi valued local params. Also, parseLocalParamsSplitted returns a new String array but it should instead just return null like getParam(paramName). Returning a non-null value always ensures that the code flow never hits: String [] qf = parseLocalParamsSplitted( "qf" ); // Never returns null if (qf != null ) { } else { // Never gets here }
          Hide
          Vitaliy Zhovtyuk added a comment -

          Multiple local parameters will not work in previous patch because Map<String, String> as result of local params parsing. Replaced it with Map<String, String[]> and MultiMapSolrParams in org.apache.solr.search.QParser and in usages. Also removed parseLocalParams (comma split params support could be complicated in case boost syntax in qf).

          Show
          Vitaliy Zhovtyuk added a comment - Multiple local parameters will not work in previous patch because Map<String, String> as result of local params parsing. Replaced it with Map<String, String[]> and MultiMapSolrParams in org.apache.solr.search.QParser and in usages. Also removed parseLocalParams (comma split params support could be complicated in case boost syntax in qf).
          Hide
          Jens Wille added a comment - - edited

          First of all, thanks to Vitaliy for providing patches and to Anshum for picking up this issue.

          Now I'm wondering what needs to happen to move this forward. I'm new here and don't really know what's expected of me. I have verified that the latest patch works for this particular issue and I, too, would like to see the query parser functionality brought on par with the handler functionality (parameters and defaults, including an mlt.match.include"exclude current document from results" equivalent).

          However, there's still a problem with the current approach (multiple qf parameters: yes, comma-separated list: no): Only the latter would work properly with parameter dereferencing:

          curl 'http://localhost:8983/solr/techproducts/select?q=%7B!mlt+qf=$mlt.fl%7DMA147LL/A&mlt.fl=name,features'
          

          Sets qf=name,features, which is not split into name and features with the latest patch.

          curl 'http://localhost:8983/solr/techproducts/select?q=%7B!mlt+qf=$mlt.fl%7DMA147LL/A&mlt.fl=name&mlt.fl=features'
          

          Sets qf=name, ignores subsequent mlt.fl parameters.

          Please let me know if I can do anything.

          Show
          Jens Wille added a comment - - edited First of all, thanks to Vitaliy for providing patches and to Anshum for picking up this issue. Now I'm wondering what needs to happen to move this forward. I'm new here and don't really know what's expected of me. I have verified that the latest patch works for this particular issue and I, too, would like to see the query parser functionality brought on par with the handler functionality (parameters and defaults, including an mlt.match.include "exclude current document from results" equivalent). However, there's still a problem with the current approach (multiple qf parameters: yes, comma-separated list: no): Only the latter would work properly with parameter dereferencing: curl 'http: //localhost:8983/solr/techproducts/select?q=%7B!mlt+qf=$mlt.fl%7DMA147LL/A&mlt.fl=name,features' Sets qf=name,features , which is not split into name and features with the latest patch. curl 'http: //localhost:8983/solr/techproducts/select?q=%7B!mlt+qf=$mlt.fl%7DMA147LL/A&mlt.fl=name&mlt.fl=features' Sets qf=name , ignores subsequent mlt.fl parameters. Please let me know if I can do anything.
          Hide
          Anshum Gupta added a comment -

          I'll try and wrap it up this week and will also link this to SOLR-2798.

          Show
          Anshum Gupta added a comment - I'll try and wrap it up this week and will also link this to SOLR-2798 .
          Hide
          Jens Wille added a comment -

          Hi Anshum, can you say anything about the status of this issue? Can you give me any pointers as to what I might be able to do?

          Show
          Jens Wille added a comment - Hi Anshum, can you say anything about the status of this issue? Can you give me any pointers as to what I might be able to do?
          Hide
          Anshum Gupta added a comment - - edited

          Hi Jens, Sorry but I haven't been able to get to this all this while.

          Here's what we need to get working:

          1. Way to specify multiple values for a field within the local params. e.g.:
            SOLR-2798 would solve this
            http://localhost:8983/solr/techproducts/select?q={!mlt qf=foo qf=bar}docid
            
          2. We also need to support parameter dereferencing as you suggested, considering we don't want to get involved with commas:
            http://localhost:8983/solr/techproducts/select?q={!mlt qf=$mlt.fl}docid&mlt.fl=foo&mlt.fl=bar
            

            Supporting comma's would interfere with the syntax used for things like bf e.g. bf=recip(rord(creationDate),1,1000,1000)

          If you have time and the motivation, it'd be great if you contribute a patch for this. We may already have parts of it from the existing patch.

          Show
          Anshum Gupta added a comment - - edited Hi Jens, Sorry but I haven't been able to get to this all this while. Here's what we need to get working: Way to specify multiple values for a field within the local params. e.g.: SOLR-2798 would solve this http: //localhost:8983/solr/techproducts/select?q={!mlt qf=foo qf=bar}docid We also need to support parameter dereferencing as you suggested, considering we don't want to get involved with commas: http: //localhost:8983/solr/techproducts/select?q={!mlt qf=$mlt.fl}docid&mlt.fl=foo&mlt.fl=bar Supporting comma's would interfere with the syntax used for things like bf e.g. bf=recip(rord(creationDate),1,1000,1000) If you have time and the motivation, it'd be great if you contribute a patch for this. We may already have parts of it from the existing patch.
          Hide
          Jens Wille added a comment -

          FYI: I'm certainly motivated to take a stab at this, just haven't found the time yet.

          Show
          Jens Wille added a comment - FYI: I'm certainly motivated to take a stab at this, just haven't found the time yet.
          Hide
          Jens Wille added a comment -

          I've added a new patch that mimics the behaviour of the MoreLikeThis Handler. So it only supports multiple values (in a comma-separated list) for the qf parameter. I also added the check for at least one similarity field from the handler.

          This works for my use case. I think general support for multiple local parameters is out of scope for this ticket and not required to make the MoreLikeThis Query Parser work like the MoreLikeThis Handler.

          Show
          Jens Wille added a comment - I've added a new patch that mimics the behaviour of the MoreLikeThis Handler. So it only supports multiple values (in a comma-separated list) for the qf parameter. I also added the check for at least one similarity field from the handler. This works for my use case. I think general support for multiple local parameters is out of scope for this ticket and not required to make the MoreLikeThis Query Parser work like the MoreLikeThis Handler.
          Hide
          Jens Wille added a comment -

          BTW: There's a lot of code duplication between CloudMLTQParser, SimpleMLTQParser and MoreLikeThisHandler. Maybe I can try to extract common pieces of code into one or more helpers if you're interested (and tell me where those helpers should live).

          Show
          Jens Wille added a comment - BTW: There's a lot of code duplication between CloudMLTQParser , SimpleMLTQParser and MoreLikeThisHandler . Maybe I can try to extract common pieces of code into one or more helpers if you're interested (and tell me where those helpers should live).
          Hide
          Anshum Gupta added a comment -

          Thanks for the patch Jens. I'll just refactor this.

          Also, I'm handling the multi-valued local params in SOLR-2798.

          Show
          Anshum Gupta added a comment - Thanks for the patch Jens. I'll just refactor this. Also, I'm handling the multi-valued local params in SOLR-2798 .
          Hide
          Anshum Gupta added a comment -

          Fixed an NPE and added a test. Will test a bit more and commit.

          Show
          Anshum Gupta added a comment - Fixed an NPE and added a test. Will test a bit more and commit.
          Hide
          Otis Gospodnetic added a comment -

          Anshum Gupta - which Solr version is this going in? Fix Version is empty. Thanks.

          Show
          Otis Gospodnetic added a comment - Anshum Gupta - which Solr version is this going in? Fix Version is empty. Thanks.
          Hide
          Anshum Gupta added a comment -

          Should be in 5.3. I'm pretty close to committing this. Just wanted to run a few tests before I did that.

          Show
          Anshum Gupta added a comment - Should be in 5.3. I'm pretty close to committing this. Just wanted to run a few tests before I did that.
          Hide
          ASF subversion and git services added a comment -

          Commit 1689531 from Anshum Gupta in branch 'dev/trunk'
          [ https://svn.apache.org/r1689531 ]

          SOLR-7143: MoreLikeThis Query parser now handles multiple field names

          Show
          ASF subversion and git services added a comment - Commit 1689531 from Anshum Gupta in branch 'dev/trunk' [ https://svn.apache.org/r1689531 ] SOLR-7143 : MoreLikeThis Query parser now handles multiple field names
          Hide
          ASF subversion and git services added a comment -

          Commit 1689556 from Anshum Gupta in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1689556 ]

          SOLR-7143: MoreLikeThis Query parser now handles multiple field names (merge from trunk)

          Show
          ASF subversion and git services added a comment - Commit 1689556 from Anshum Gupta in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1689556 ] SOLR-7143 : MoreLikeThis Query parser now handles multiple field names (merge from trunk)
          Hide
          Jens Wille added a comment -

          Anshum, thank you so much for committing this. I'm looking forward to finally being able to use the new MoreLikeThis Query Parser in 5.3

          Show
          Jens Wille added a comment - Anshum, thank you so much for committing this. I'm looking forward to finally being able to use the new MoreLikeThis Query Parser in 5.3
          Hide
          Anshum Gupta added a comment -

          Thanks Jens!

          Show
          Anshum Gupta added a comment - Thanks Jens!
          Hide
          Shalin Shekhar Mangar added a comment -

          Bulk close for 5.3.0 release

          Show
          Shalin Shekhar Mangar added a comment - Bulk close for 5.3.0 release

            People

            • Assignee:
              Anshum Gupta
              Reporter:
              Jens Wille
            • Votes:
              1 Vote for this issue
              Watchers:
              11 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development