SOLR-6832: Queries be served locally rather than being forwarded to another replica

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.10.2
    • Fix Version/s: 5.1, 6.0
    • Component/s: SolrCloud
    • Labels: None

    Description

      Currently, the code flow for a query in SolrCloud is as follows:

      For a distributed query:
      SolrCore -> SearchHandler.handleRequestBody() -> HttpShardHandler.submit()

      For a non-distributed query:
      SolrCore -> SearchHandler.handleRequestBody() -> QueryComponent.process()

      For a distributed query, the request is always sent to all the shards, even if the originating SolrCore (handling the original distributed query) is a replica of one of those shards.
      If the originating SolrCore checked for a local replica before sending HTTP requests for each shard, we could save some network hops and gain some performance.

      We can change SearchHandler.handleRequestBody() or HttpShardHandler.submit() to fix this behavior (most likely the former rather than the latter).

      Attachments

        1. SOLR-6832.patch
          14 kB
          Timothy Potter
        2. SOLR-6832.patch
          17 kB
          Sachin Goyal
        3. SOLR-6832.patch
          10 kB
          Sachin Goyal
        4. SOLR-6832.patch
          13 kB
          Sachin Goyal

        Issue Links

          Activity

            sachingoyal Sachin Goyal added a comment -

            The performance gain increases if coresPerMachine is > 1 and a single JVM has cores from 'k' shards.

            elyograg Shawn Heisey added a comment -

            I have concerns, but I don't want to derail the work. There are use-cases for which this would be very useful, but many other use-cases where it would cause a single machine to crumble under the load while other machines in the cloud are nearly idle.

            Duplicating what I said on the dev@l.a.o thread:

            Consider a SolrCloud that is handling 5000 requests per second with a replicationFactor of 20 or 30. This could be one shard or multiple shards. Currently, those requests will be load balanced to the entire cluster. If this option is implemented, suddenly EVERY request will have at least one part handled locally ... and unless the index is very tiny or 99 percent of the queries hit a Solr cache, one index core simply won't be able to handle 5000 queries per second. Getting a single machine capable of handling that load MIGHT be possible, but it would likely be VERY expensive.

            This would be great as an OPTION that can be enabled when the index composition and query patterns dictate it will be beneficial ... but it definitely should not be default behavior.

            ayonsinha Ayon Sinha added a comment -

            Hi elyograg, I work with sachingoyal. The background of this patch is that we have a cluster of 14 machines serving upwards of 5000 qps, and when one machine goes into a multi-second GC pause, it easily brings down the entire cluster. I know this is not the sole cause of the distributed deadlock, and we have definitely fixed other things (GC pauses, thread counts, etc.) to reduce the likelihood of this problem.

            In the scenario that you mention, the load balancer outside SolrCloud is at fault, and when that is the case we'd like it to take down only one replica rather than propagate the problem to other replicas.

            So to be clear, when this option is ON, the only thing you'll "lose" is the extra load balancing among the shard queries. And frankly, when I have all the shards on the same node, I prefer NOT to go over the network, as the network is among the most unreliable and heavily taxed resources in cloud environments. When we go over the network to another machine, I have no idea what is carrying me over there or how that other node is doing overall.
            We will post our results on the benefit of having this option ON.

            elyograg Shawn Heisey added a comment -

            That sounds like a perfect use-case for this option. In your setup, you have an external load balancer and are not relying on SolrCloud itself or the zookeeper-aware Java client (CloudSolrServer) to do the load balancing for you. For an environment like that, letting SolrCloud forward the request adds a completely unnecessary network hop, along with new Java objects and subsequent garbage that must be collected.

            This is why I said I didn't want to derail the work. If you have a solution, we should try to get it to a state where it can be committed. It is very clear that it will be an immense help for many users. I just don't want it to become the default.

            Trying to come up with a useful and descriptive option name that's not horribly long ... that's a challenge. Something like handleRequestsLocally may be too generic, but it's a lot shorter than handleShardRequestsLocallyIfPossible!

            ayonsinha Ayon Sinha added a comment -

            Our clients actually do use CloudSolrServer (the load-balancing SolrJ client). Is there something we should be worrying about there? We are under the impression that the ZooKeeper-aware CloudSolrServer does round-robin load balancing when sending query requests.

            We only intend to apply 'preferLocalShards' on the Solr node side.

            BTW, how is the name 'preferLocalShards'?

            elyograg Shawn Heisey added a comment -

            CloudSolrServer does load balance, so you do not need an external load balancer. Internally, it uses changes in the zookeeper clusterstate to add and remove URLs on an instance of LBHttpSolrServer, which in turn uses HttpSolrServer for each of those URLs.

            https://lucene.apache.org/solr/4_10_2/solr-solrj/org/apache/solr/client/solrj/impl/LBHttpSolrServer.html

            The name preferLocalShards is perfect ... and I think a good case can be made for CloudSolrServer using this for queries (probably via a query URL parameter) by default.
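
            As a rough illustration of that per-request usage, here is a hypothetical SolrJ sketch only; it assumes the parameter ends up being named preferLocalShards and is passed as a plain query parameter, and the ZooKeeper host string and collection name are placeholders:

              import org.apache.solr.client.solrj.SolrQuery;
              import org.apache.solr.client.solrj.impl.CloudSolrServer;
              import org.apache.solr.client.solrj.response.QueryResponse;

              public class PreferLocalShardsQueryExample {
                public static void main(String[] args) throws Exception {
                  // placeholder ZooKeeper ensemble and collection
                  CloudSolrServer client = new CloudSolrServer("zkhost1:2181,zkhost2:2181,zkhost3:2181/solr");
                  client.setDefaultCollection("collection1");

                  SolrQuery query = new SolrQuery("*:*");
                  // assumed parameter name from this discussion; the node receiving the request
                  // would then favor its own replicas for the per-shard sub-requests
                  query.set("preferLocalShards", true);

                  QueryResponse rsp = client.query(query);
                  System.out.println("numFound: " + rsp.getResults().getNumFound());
                  client.shutdown();
                }
              }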

            elyograg Shawn Heisey added a comment -

            We might even be able to shorten the parameter name to preferLocal, but that will require some further thought. I'd hate to have the shorter version in use when another preferLocalXXX requirement comes up.

            elyograg Shawn Heisey added a comment -

            A slightly better choice might be preferLocalReplicas ... but Shards is pretty good too.

            sachingoyal Sachin Goyal added a comment -

            Attaching a patch for this ticket.

            The patch adds an option, preferLocalShards, both in solrconfig.xml and in the query request params (the query parameter takes precedence).

            If this option is set, HttpShardHandler.preferCurrentHostForDistributedReq() tries to find a local URL and puts that URL first in the list of URLs sent to LBHttpSolrServer. This ensures that the current host's cores are preferred for distributed queries.

            The other important function is ResponseBuilder.findCurrentHostAddress(), which tries to find the current host's URL by searching for the current core's name in the list of shards. The URL found by this function is used by the HttpShardHandler function above.

            The default value of the option is 'false' to preserve the current behavior.
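
            In rough terms, the reordering amounts to something like the following simplified sketch of the idea only, not the actual patch; the class, method, and variable names here are illustrative:

              import java.util.ArrayList;
              import java.util.List;

              public class PreferLocalReplicaSketch {
                // Given the replica URLs for one shard and the current node's base URL,
                // move any locally hosted replica to the front of the list so that
                // LBHttpSolrServer tries the local replica first and only falls back
                // to remote replicas on failure.
                static List<String> preferCurrentHost(String currentNodeBaseUrl, List<String> replicaUrls) {
                  List<String> reordered = new ArrayList<>(replicaUrls.size());
                  for (String url : replicaUrls) {
                    if (url.startsWith(currentNodeBaseUrl)) {
                      reordered.add(0, url);  // local replica goes first
                    } else {
                      reordered.add(url);
                    }
                  }
                  return reordered;
                }
              }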

            thelabdude Timothy Potter added a comment -

            sachingoyal Thanks for the patch! I'm working to get it to a committable state.

            I don't think adding preferLocalShards as a collection-level setting (in SolrConfig) adds much value here. If an operator wants to enforce that query parameter for all requests, they can use the built-in support for defaults or invariants on the appropriate query request handler, e.g. to make this the default on the /select handler, you could do:

              <requestHandler name="/select" class="solr.SearchHandler">
                 <lst name="defaults">
                   <str name="echoParams">explicit</str>
                   <int name="rows">10</int>
                   <bool name="preferLocalShards">true</bool>
                   <!-- ... other defaults ... -->
                 </lst>
              </requestHandler>

            Both approaches require some config changes in solrconfig.xml, but the latter (my suggestion) avoids adding new code / config settings. That said, please let me know if you think there's another reason to have this as an explicit setting in solrconfig.xml.

            Also, all the code in findCurrentHostAddress can simply be replaced by ZkController.getBaseUrl() when needed.

            sachingoyal Sachin Goyal added a comment -

            thelabdude, I do not feel very strongly about the configuration option in solrconfig.xml.
            I kept it only because specifying a global option looked simpler to use.

            I will try the ZkController.getBaseUrl() and update the patch shortly with the above suggestions.

            Thank you for reviewing.

            thelabdude Timothy Potter added a comment -

            Awesome - btw ... in case you haven't seen this before, it's a little cumbersome to get at the ZkController from the req object, something like:

            req.getCore().getCoreDescriptor().getCoreContainer().getZkController().getBaseUrl()
            
            sachingoyal Sachin Goyal added a comment -

            Oh great. Thanks for saving me a search for the ZkController.

            sachingoyal Sachin Goyal added a comment -

            thelabdude, here is a new patch with the above comments incorporated.
            I have checked that

            req.getCore().getCoreDescriptor().getCoreContainer().getZkController().getBaseUrl()

            works well and so

            ResponseBuilder.findCurrentHostAddress()

            is no longer required.

            Configuration change is also done.

            thelabdude Timothy Potter added a comment -

            Thanks for the updated patch. The only thing we need now is a good unit test. I can take a stab at that over the next few days.

            sachingoyal Sachin Goyal added a comment -

            Thanks thelabdude.
            If you can point me to an existing test case that shows how to create a multi-node cluster and run updates and queries against it, I can help with the unit-test creation.

            sachingoyal Sachin Goyal added a comment -

            thelabdude, I have included a unit-test in the new patch.

            This tests that the response is served from local cores only when preferLocalShards is set to true in the query.
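
            For reference, a client-side version of that check might look roughly like this; a hypothetical sketch only, not the test from the patch. It assumes that shards.info=true reports a shardAddress per shard and uses a placeholder node URL:

              import org.apache.solr.client.solrj.SolrQuery;
              import org.apache.solr.client.solrj.impl.HttpSolrServer;
              import org.apache.solr.client.solrj.response.QueryResponse;
              import org.apache.solr.common.util.NamedList;

              public class LocalShardsCheck {
                public static void main(String[] args) throws Exception {
                  String nodeBaseUrl = "http://host1:8983/solr";  // placeholder node
                  HttpSolrServer node = new HttpSolrServer(nodeBaseUrl + "/collection1");

                  SolrQuery query = new SolrQuery("*:*");
                  query.set("preferLocalShards", true);
                  query.set("shards.info", true);

                  QueryResponse rsp = node.query(query);
                  // shards.info (assumed) lists, per shard, which replica actually served it
                  NamedList<?> shardsInfo = (NamedList<?>) rsp.getResponse().get("shards.info");
                  for (int i = 0; i < shardsInfo.size(); i++) {
                    NamedList<?> info = (NamedList<?>) shardsInfo.getVal(i);
                    String shardAddress = (String) info.get("shardAddress");
                    if (!shardAddress.startsWith(nodeBaseUrl)) {
                      throw new AssertionError("Shard served remotely: " + shardAddress);
                    }
                  }
                  node.shutdown();
                }
              }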

            thelabdude Timothy Potter added a comment -

            sachingoyal It seems like your latest patch was created / tested against branch4x vs. trunk? It's better to work against trunk for new features and then we'll back-port the changes as needed. I went ahead and migrated your patch to work with trunk and cleaned up a few places in the code. Overall looking good!

            thelabdude Timothy Potter added a comment -

            Also, I don't think we need to include this parameter in all of the configs, as we're trying to get away from bloated configs. So I changed the patch to include it only in the sample techproducts configs. We'll also need to document this parameter in the Solr Reference Guide.

            sachingoyal Sachin Goyal added a comment -

            Thank you thelabdude.
            Please let me know how we can get this committed into the trunk and I can edit the Solr reference guide.
            I would also like to back-port this into the 5x branch.

            thelabdude Timothy Potter added a comment -

            Working on committing this now.


            jira-bot ASF subversion and git services added a comment -

            Commit 1659748 from thelabdude in branch 'dev/trunk'
            [ https://svn.apache.org/r1659748 ]

            SOLR-6832: Queries be served locally rather than being forwarded to another replica

            jira-bot ASF subversion and git services added a comment -

            Commit 1659750 from thelabdude in branch 'dev/branches/branch_5x'
            [ https://svn.apache.org/r1659750 ]

            SOLR-6832: Queries be served locally rather than being forwarded to another replica
            sachingoyal Sachin Goyal added a comment -

            Thank you thelabdude!


            otis Otis Gospodnetic added a comment -

                The performance gain increases if coresPerMachine is > 1 and a single JVM has cores from 'k' shards.

            Did you ever manage to measure how much this feature helps in various scenarios?

                For a distributed query, the request is always sent to all the shards, even if the originating SolrCore (handling the original distributed query) is a replica of one of those shards. If the originating SolrCore checked for a local replica before sending HTTP requests for each shard, we could save some network hops and gain some performance.

            This sounds like it saves only N calls out of M, where M > N, N is the number of shards that can be queried via a local replica, and M is the total number of shards in the cluster that are to be queried. Is this correct?

            So say there are 20 shards spread evenly over 20 nodes (i.e., 1 shard per node) and a query request comes in; the node that got the request will send 19 requests to the remaining 19 nodes and thus save just one network trip by querying its local shard? I must be missing something...
            ayonsinha Ayon Sinha added a comment -

            @Otis, you are correct. This helps only where there is over-sharding. In our particular scenario, we sharded to get better CPU core utilization and write speeds, based on Tim's experiments with over-sharding. Since all shard queries were sent to other nodes, we were getting hit with distributed deadlocks more often when one or more nodes were slow or overloaded.
            So this patch is a slight optimization and reduces the likelihood of getting bogged down by other slow nodes when the node handling the parent query hosts the core.

            otis Otis Gospodnetic added a comment -

            Hmmmm... I didn't examine the patch or try this functionality, but based on your description, here are some comments.

                This helps only where there is over-sharding.

            That in itself should be avoided whenever possible, in my experience, because of the extra memory and communication overhead during querying. It could be related to your deadlocks. Or maybe you do a lot more writes, so distributing writes across all nodes is worth the query-time overhead of over-sharding?

                Since all shard queries were sent to other nodes, we were getting hit with distributed deadlocks more often when one or more nodes were slow or overloaded.

            Hmmmm... if that is truly happening, then isn't that a separate issue to be fixed?

                So this patch is a slight optimization and reduces the likelihood of getting bogged down by other slow nodes when the node handling the parent query hosts the core.

            But VERY slight, right? (Hence my question about whether you've quantified the improvement from this patch.)

            Intuitively, querying the local data makes sense - why would one not do that if the data is right there? I just wonder how much you really benefit if you are saving just one (or a very small number of) network calls in a request that ends up dispatching N requests to N other nodes in the cluster.
            ayonsinha Ayon Sinha added a comment -

            Actually, in our experience, the network has been the flakiest piece, so any network hop saved is a big deal.

            And again, you are right that the root cause (the first domino) of the distributed deadlock is yet to be identified. What we see is that when one machine in the cluster goes into a GC pause or hits a traffic spike, it quickly brings down all the other machines. The slow machine currently does not tell ZK that it is struggling, so all the other nodes keep sending it queries. This is being addressed in another JIRA.

            This particular patch buys us some time.

            thelabdude Timothy Potter added a comment -

            Bulk close after 5.1 release

            sachingoyal Sachin Goyal added a comment -

            https://issues.apache.org/jira/browse/SOLR-7121 is a patch to address the other part of this problem.
            It helps nodes become aware of their own slowness and tell ZK that they should be taken out of the cluster for a while.
            When their health has recovered, the nodes automatically ask ZK to rejoin the cluster.

            These two patches have made our cluster stable, though we have yet to quantify by how much (quantification is not really a priority right now, given that we would need to compare against an un-patched cluster and then put enough load on both to bring them down, etc.).


            People

              Assignee: thelabdude Timothy Potter
              Reporter: sachingoyal Sachin Goyal
              Votes: 0
              Watchers: 10
