SOLR-17419

Improve HttpShardHandler performance in many-shard collections

Details

    Description

      In Solr 8, HttpShardHandler sent shard requests by submitting Callables to an ExecutorService. As a result, both "request-sending" and "response-awaiting" happened asynchronously with respect to the original request thread.

        @Override
        public void submit(final ShardRequest sreq, final String shard, final ModifiableSolrParams params) {
          ShardRequestor shardRequestor = new ShardRequestor(sreq, shard, params, this); // Callable
          try {
            shardRequestor.init();
            pending.add(completionService.submit(shardRequestor));
          } finally {
            shardRequestor.end();
          }   
        }
      

      However, in Solr 9.x, HttpShardHandler ditched the ExecutorService/per-request-thread approach in favor of sending all requests serially using "SolrClient.requestAsync". SOLR-14354, which made this change, did so in an effort to avoid unnecessary thread and CPU context-switching. As Dat described in SOLR-14354:

      after sending a request that thread basically do nothing just waiting for response from other side. That thread will be swapped out and CPU will try to handle another thread (this is called context switch, CPU will save the context of the current thread and switch to another one). When some data (not all) come back, that thread will be called to parsing these data, then it will wait until more data come back. So there will be lots of context switching in CPU. That is quite inefficient
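
      To make the Solr 9.x shape concrete, here is a minimal sketch of the "serial send, asynchronous response" pattern. It is not Solr code: `sendAsync` and `queryAllShards` are hypothetical stand-ins, and the simulated "shard" simply returns a string. The point it tries to show is that the request thread itself issues every send, one after another, and only the response handling is asynchronous.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch: none of these names are Solr APIs.
class SerialAsyncDispatchSketch {

  // Stand-in for an async client call: returns a future right away that
  // completes once the (simulated) shard has produced its response.
  static CompletableFuture<String> sendAsync(String shard) {
    return CompletableFuture.supplyAsync(() -> "response from " + shard);
  }

  // The calling thread walks the shard list and issues each send itself,
  // so any per-send cost is paid serially on that one thread.
  static List<String> queryAllShards(List<String> shards) {
    List<CompletableFuture<String>> pending = new ArrayList<>();
    for (String shard : shards) {
      pending.add(sendAsync(shard));
    }
    // Collect responses in shard order once all sends have been issued.
    List<String> responses = new ArrayList<>();
    for (CompletableFuture<String> future : pending) {
      responses.add(future.join());
    }
    return responses;
  }

  public static void main(String[] args) {
    System.out.println(queryAllShards(List.of("shard1", "shard2", "shard3")));
  }
}
```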

      This approach comes with a downside though: all the shard requests are sent serially, on the original request thread. If sending each request takes ~1ms, then a user is unlikely to notice this in a collection with 5 or 10 shards. But the cost scales linearly with the shard count, so in a collection with 50 shards this approach would bake a ~50ms delay into the critical path of every single query!
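
      The arithmetic behind that estimate can be sketched as follows. The ~1ms per-send figure is the illustrative number used in this description, not a measurement, and `serialSendDelayMs` is a hypothetical helper.

```java
// Hypothetical helper illustrating the linear cost of serial sends.
class SerialSendCost {

  // Time spent issuing sends one after another before the last shard
  // request has left the request thread.
  static double serialSendDelayMs(int shardCount, double perSendMs) {
    return shardCount * perSendMs;
  }

  public static void main(String[] args) {
    double perSendMs = 1.0; // assumed per-send cost from the description
    for (int shards : new int[] {5, 10, 50}) {
      System.out.printf("%d shards -> ~%.0f ms%n",
          shards, serialSendDelayMs(shards, perSendMs));
    }
  }
}
```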

      This issue is intended to reevaluate whether there's a better way to balance these concerns. Ideally we can come up with an approach that improves all scenarios. Lacking that, maybe Solr could choose between one of several approaches semi-intelligently based on the number of shards or other factors?
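
      As one possible shape for the "choose semi-intelligently" idea, the sketch below picks a dispatch strategy from the shard count alone. Everything in it is an assumption made for illustration: the `Strategy` names, the `choose` method, and especially the threshold of 20 shards, which is an arbitrary placeholder and not a benchmarked value.

```java
// Hypothetical sketch: not an existing Solr class or configuration option.
class DispatchStrategyChooser {

  enum Strategy {
    SERIAL_ASYNC,         // Solr 9.x style: request thread sends each request in turn
    EXECUTOR_PER_REQUEST  // Solr 8 style: each send is submitted to an ExecutorService
  }

  // Arbitrary placeholder; a real cutoff would need to come from benchmarks.
  static final int HYPOTHETICAL_SHARD_THRESHOLD = 20;

  // Few shards: serial sends are cheap and avoid extra context switching.
  // Many shards: hand sends to a pool so they do not queue up on one thread.
  static Strategy choose(int shardCount) {
    return shardCount < HYPOTHETICAL_SHARD_THRESHOLD
        ? Strategy.SERIAL_ASYNC
        : Strategy.EXECUTOR_PER_REQUEST;
  }

  public static void main(String[] args) {
    for (int shards : new int[] {5, 10, 50}) {
      System.out.println(shards + " shards -> " + choose(shards));
    }
  }
}
```

      Other factors mentioned above (for example, measured per-send latency) could feed into `choose` in the same way.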

      Attachments

        1. shardhandler-perf-graph.png
          212 kB
          Jason Gerlowski

            People

              gerlowskija Jason Gerlowski