Solr
  1. Solr
  2. SOLR-6311

SearchHandler should use path when no qt or shard.qt parameter is specified

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 4.9
    • Fix Version/s: 5.1, 6.0
    • Component/s: None
    • Labels:
      None

      Description

      When performing distributed searches, you have to specify shards.qt unless you're on the default /select path for your handler. As this is configurable, even the default search handler could be on another path. The shard requests should thus default to the path if no shards.qt was specified.

      1. SOLR-6311.patch
        5 kB
        Timothy Potter
      2. SOLR-6311.patch
        1 kB
        Steve Molloy

        Activity

        Hide
        Steve Molloy added a comment -

        This patch will use shards.qt if specified, default to qt if not, then default to path if both were omitted.

        Show
        Steve Molloy added a comment - This patch will use shards.qt if specified, default to qt if not, then default to path if both were omitted.
        Hide
        David Smiley added a comment -

        Strong +1 from me; I have no idea why it wasn't this way from the beginning.

        Show
        David Smiley added a comment - Strong +1 from me; I have no idea why it wasn't this way from the beginning.
        Hide
        Yonik Seeley added a comment -

        When performing distributed searches, you have to specify shards.qt unless you're on the default /select path for your handler.

        This was by design, and we should consider very carefully before changing it.
        Most of the time, other handlers were used to add default parameters. When this is the case, it's preferable to hit a bare /select handler for sub-requests as hitting the same handler again and adding defaults again will have a lot of side effects and sometimes produce incorrect distributed results. The worst is when a handler specifies shards or something, and this causes an endless loop.

        Show
        Yonik Seeley added a comment - When performing distributed searches, you have to specify shards.qt unless you're on the default /select path for your handler. This was by design, and we should consider very carefully before changing it. Most of the time, other handlers were used to add default parameters. When this is the case, it's preferable to hit a bare /select handler for sub-requests as hitting the same handler again and adding defaults again will have a lot of side effects and sometimes produce incorrect distributed results. The worst is when a handler specifies shards or something, and this causes an endless loop.
        Hide
        David Smiley added a comment -

        Good point Yonik; I forgot about that. It's not this simple then. In general, the need to specify shards.qt is to ensure one's customized components (e.g. spellcheck) are registered; it's not for request parameters. Perhaps the solution is to have a shard request ignore parameters specified in the request handler?

        Show
        David Smiley added a comment - Good point Yonik; I forgot about that. It's not this simple then. In general, the need to specify shards.qt is to ensure one's customized components (e.g. spellcheck) are registered; it's not for request parameters. Perhaps the solution is to have a shard request ignore parameters specified in the request handler?
        Hide
        Steve Molloy added a comment -

        Thanks Yonik, hadn't thought of that. I'll think about alternatives for this as. Our main usage of different handlers is for using different sets of components (the suggester in its own handler for autocomplete being an example), in which case it seems wrong to force the request to contain shards.qt for distributed searches when we're trying to hide the fact that it's distributed by using collections.

        Show
        Steve Molloy added a comment - Thanks Yonik, hadn't thought of that. I'll think about alternatives for this as. Our main usage of different handlers is for using different sets of components (the suggester in its own handler for autocomplete being an example), in which case it seems wrong to force the request to contain shards.qt for distributed searches when we're trying to hide the fact that it's distributed by using collections.
        Hide
        Steve Molloy added a comment -

        Haven't had much time to look into alternate solution yet, but for the point of default parameters. What is then the impact of having default parameters on /select handler itself? Is there some limitations on parameters that should be set as defaults in /select handler? Even example config comes with defaults and suggests that more can be added.

        Show
        Steve Molloy added a comment - Haven't had much time to look into alternate solution yet, but for the point of default parameters. What is then the impact of having default parameters on /select handler itself? Is there some limitations on parameters that should be set as defaults in /select handler? Even example config comes with defaults and suggests that more can be added.
        Hide
        Timothy Potter added a comment -

        Same issue I stumbled upon trying to fix SOLR-4479.

        Most of the time, other handlers were used to add default parameters.

        True, but sometimes they also add search components so silently routing shard requests to /select is just plain wrong.

        Seems like maybe the the hook to determine if the path should be applied as the default for shards.qt is whether there are custom components defined on that instance of SearchHandler? Seems a little nuanced but better for new users than the current experience.

        Show
        Timothy Potter added a comment - Same issue I stumbled upon trying to fix SOLR-4479 . Most of the time, other handlers were used to add default parameters. True, but sometimes they also add search components so silently routing shard requests to /select is just plain wrong. Seems like maybe the the hook to determine if the path should be applied as the default for shards.qt is whether there are custom components defined on that instance of SearchHandler? Seems a little nuanced but better for new users than the current experience.
        Hide
        Hoss Man added a comment -

        doesn't the initParam stuff noble added improve the situation with vanity handler paths that have defaults? and doesn't it play nice with distibuted requests? (so that it only adds the defaults on the initial request, not the per-shard requests – if not it should)

        i think we should just bite the bullet on making the switch...

        straw man:

        • make behavior conditional on LUCENE_MATCH_VERSION
          • if < X, shards.qt defaults to /select
          • if > X, shards.qt defaults to current handler
        • users who want to upgrade LUCENE_MATCH_VERSION but still get legacy behavior for vanity handler defaults should make hte burden of adding shards.qt=/select to the "invariants" for each of their vanity handlers
        Show
        Hoss Man added a comment - doesn't the initParam stuff noble added improve the situation with vanity handler paths that have defaults? and doesn't it play nice with distibuted requests? (so that it only adds the defaults on the initial request, not the per-shard requests – if not it should) i think we should just bite the bullet on making the switch... straw man: make behavior conditional on LUCENE_MATCH_VERSION if < X, shards.qt defaults to /select if > X, shards.qt defaults to current handler users who want to upgrade LUCENE_MATCH_VERSION but still get legacy behavior for vanity handler defaults should make hte burden of adding shards.qt=/select to the "invariants" for each of their vanity handlers
        Hide
        Noble Paul added a comment -

        Hoss Man
        With InitParams , the behavior is exacly similar as if we have specified the the config in the requestHandler itself

        But, as you said, I expect users to configure a lot more paths with the useParams feature with each mapping to a SearchHandler itself but a different set of params and a nice descriptive name. Coupled with the fact that you can change the params themselves pretty easily it will be more common. So you may see paths like /bestproducts-by-rating /best-products-by-sale etc.

        So it should not really hurt moving to shards.qt=qt

        Show
        Noble Paul added a comment - Hoss Man With InitParams , the behavior is exacly similar as if we have specified the the config in the requestHandler itself But, as you said, I expect users to configure a lot more paths with the useParams feature with each mapping to a SearchHandler itself but a different set of params and a nice descriptive name. Coupled with the fact that you can change the params themselves pretty easily it will be more common. So you may see paths like /bestproducts-by-rating /best-products-by-sale etc. So it should not really hurt moving to shards.qt=qt
        Hide
        David Smiley added a comment -

        i think we should just bite the bullet on making the switch...

        +1 to that! It should have been done this way to begin with. I consider it a bug that distributed requests were apparently hard-coded to use /select

        Show
        David Smiley added a comment - i think we should just bite the bullet on making the switch... +1 to that! It should have been done this way to begin with. I consider it a bug that distributed requests were apparently hard-coded to use /select
        Hide
        Hoss Man added a comment -

        It should have been done this way to begin with. I consider it a bug that distributed requests were apparently hard-coded to use /select

        Definitely not a bug.

        you have to remember the context of how distributed search was added – prior to SolrCloud, you had to specify a "shards" param listing all of the cores you wanted to do a distributed search over, and the primary convinience mechanism for doing that was to register a handler like this...

        <requestHandler name="/my_handler" class="solr.SearchHandler"/>
          <lst name="defaults">
            <str name="shards">foo:8983/solr,bar:8983/solr</str>
            <int name="rows">100</int>
          </lst>
        </requestHandler>
        

        ...so the choice to have "shards.qt" default to "/select" instead of the current qt was extremely important to make distributed search function correctly for most users for multiple reasons:

        1) so that the shards param wouldn't cause infinite recursion
        2) so that the "defaults" wouldn't be automatically inherited by the per-shard requests

        But now is not then – the default behavior of shards.qt should change to make the most sense given the features and best practice currently available in Solr. SolrCloud solves #1, and IIUC useParams solves #2, so we can move forward.

        Show
        Hoss Man added a comment - It should have been done this way to begin with. I consider it a bug that distributed requests were apparently hard-coded to use /select Definitely not a bug. you have to remember the context of how distributed search was added – prior to SolrCloud, you had to specify a "shards" param listing all of the cores you wanted to do a distributed search over, and the primary convinience mechanism for doing that was to register a handler like this... <requestHandler name="/my_handler" class="solr.SearchHandler"/> <lst name="defaults"> <str name="shards">foo:8983/solr,bar:8983/solr</str> <int name="rows">100</int> </lst> </requestHandler> ...so the choice to have "shards.qt" default to "/select" instead of the current qt was extremely important to make distributed search function correctly for most users for multiple reasons: 1) so that the shards param wouldn't cause infinite recursion 2) so that the "defaults" wouldn't be automatically inherited by the per-shard requests But now is not then – the default behavior of shards.qt should change to make the most sense given the features and best practice currently available in Solr. SolrCloud solves #1, and IIUC useParams solves #2, so we can move forward.
        Hide
        Timothy Potter added a comment -

        I'm going with Hoss Man's suggestion of using the LUCENE_MATCH_VERSION and am targeting this fix for the 5.1 release. So my first inclination was to do:

        if (req.getCore().getSolrConfig().luceneMatchVersion.onOrAfter(Version.LUCENE_5_1_0)) {
         ...
        

        But Version.LUCENE_5_1_0 is deprecated, so do I do this instead?

        if (req.getCore().getSolrConfig().luceneMatchVersion.onOrAfter(Version.LATEST)) {
        ...
        

        I guess it's the deprecated thing that's throwing me off.

        Show
        Timothy Potter added a comment - I'm going with Hoss Man 's suggestion of using the LUCENE_MATCH_VERSION and am targeting this fix for the 5.1 release. So my first inclination was to do: if (req.getCore().getSolrConfig().luceneMatchVersion.onOrAfter(Version.LUCENE_5_1_0)) { ... But Version.LUCENE_5_1_0 is deprecated, so do I do this instead? if (req.getCore().getSolrConfig().luceneMatchVersion.onOrAfter(Version.LATEST)) { ... I guess it's the deprecated thing that's throwing me off.
        Hide
        Timothy Potter added a comment -

        nm! If I look at branch5x, my question is answered sometimes you have to look outside of trunk to see clearly!

        Show
        Timothy Potter added a comment - nm! If I look at branch5x, my question is answered sometimes you have to look outside of trunk to see clearly!
        Hide
        Timothy Potter added a comment -

        Patch that implements the conditional logic based on the luceneMatchVersion. I'm intending this fix to be included in 5.1. The TermVectorComponentDistributedTest test now works without specifying the shards.qt query param. Feedback welcome!

        Show
        Timothy Potter added a comment - Patch that implements the conditional logic based on the luceneMatchVersion. I'm intending this fix to be included in 5.1. The TermVectorComponentDistributedTest test now works without specifying the shards.qt query param. Feedback welcome!
        Hide
        Steve Molloy added a comment -

        Definitely not a bug. you have to remember the context of how distributed search was added

        Thanks for the history, makes it clearer why it was needed.

        But now is not then

        Indeed, now distributed/SolrCloud is pretty much the norm...

        So anyhow, patch with logic on version makes sense for me, so +1.

        Show
        Steve Molloy added a comment - Definitely not a bug. you have to remember the context of how distributed search was added Thanks for the history, makes it clearer why it was needed. But now is not then Indeed, now distributed/SolrCloud is pretty much the norm... So anyhow, patch with logic on version makes sense for me, so +1.
        Hide
        ASF subversion and git services added a comment -

        Commit 1659694 from Timothy Potter in branch 'dev/trunk'
        [ https://svn.apache.org/r1659694 ]

        SOLR-6311: SearchHandler should use path when no qt or shard.qt parameter is specified

        Show
        ASF subversion and git services added a comment - Commit 1659694 from Timothy Potter in branch 'dev/trunk' [ https://svn.apache.org/r1659694 ] SOLR-6311 : SearchHandler should use path when no qt or shard.qt parameter is specified
        Hide
        ASF subversion and git services added a comment -

        Commit 1659699 from Timothy Potter in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1659699 ]

        SOLR-6311: SearchHandler should use path when no qt or shard.qt parameter is specified

        Show
        ASF subversion and git services added a comment - Commit 1659699 from Timothy Potter in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1659699 ] SOLR-6311 : SearchHandler should use path when no qt or shard.qt parameter is specified
        Hide
        Timothy Potter added a comment -

        Bulk close after 5.1 release

        Show
        Timothy Potter added a comment - Bulk close after 5.1 release

          People

          • Assignee:
            Timothy Potter
            Reporter:
            Steve Molloy
          • Votes:
            1 Vote for this issue
            Watchers:
            6 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development