  1. Solr
  2. SOLR-14354

HttpShardHandler: send requests asynchronously



    • Type: Improvement
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: master (9.0)
    • Component/s: None
    • Labels:


      1. Current approach (problem) of Solr

      The diagram below describes how Solr currently handles a request.

      The main thread that handles a search request submits n requests (n equals the number of shards) to an executor, so each request corresponds to one thread. After sending its request, that thread does nothing but wait for the response from the other side. It is swapped out and the CPU moves on to another thread (this is a context switch: the CPU saves the context of the current thread and switches to another one). When some (not all) of the data comes back, the thread is scheduled again to parse that data, and then it waits until more data arrives. The result is a lot of context switching, which is an inefficient use of threads. Basically we want fewer threads, and most of them should be busy all the time, because neither threads nor context switches are free. That is the main idea behind everything, like executors.
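
      To make the cost concrete, here is a minimal, JDK-only sketch of the thread-per-shard model described above. It is not the HttpShardHandler code: callShard is a hypothetical stand-in for a blocking HTTP call, and the 50 ms sleep is an assumed latency.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class BlockingFanOut {
  // Hypothetical stand-in for a blocking HTTP call to one shard.
  static String callShard(int shard) throws InterruptedException {
    Thread.sleep(50); // the thread is parked here, doing no useful work
    return "response-from-shard" + shard;
  }

  // Submits one task per shard; each task occupies a thread until its response arrives.
  static List<String> query(int numShards) throws Exception {
    ExecutorService executor = Executors.newFixedThreadPool(numShards);
    try {
      List<Future<String>> futures = new ArrayList<>();
      for (int i = 1; i <= numShards; i++) {
        final int shard = i;
        futures.add(executor.submit(() -> callShard(shard)));
      }
      List<String> responses = new ArrayList<>();
      for (Future<String> f : futures) {
        responses.add(f.get()); // the main thread blocks here as well
      }
      return responses;
    } finally {
      executor.shutdown();
    }
  }
}
```

      With n shards this keeps n worker threads alive for the whole round trip, even though each one spends almost all of that time waiting.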

      2. Async call of Jetty HttpClient

      Jetty HttpClient offers an async API like this:

      client.newRequest(...)
              // Add request hooks
              .onRequestQueued(request -> { ... })
              .onRequestBegin(request -> { ... })
              // Add response hooks
              .onResponseBegin(response -> { ... })
              .onResponseHeaders(response -> { ... })
              .onResponseContent((response, buffer) -> { ... })
              .send(result -> { ... });

      Therefore, after calling send() the thread returns immediately without blocking. When the client receives the headers from the other side, it calls the onHeaders() listeners. When the client receives some byte[] of the response (not the whole response), it calls the onContent(buffer) listeners. When everything has finished, it calls the onComplete listeners. One important thing to notice is that all listeners should finish quickly: if a listener blocks, no further data of that request is handled until the listener finishes.
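
      The callback order described above can be sketched without Jetty. In this illustration the Listener interface and deliver() are hypothetical stand-ins that only borrow the callback names from the text; deliver() plays the role of the client's I/O thread and invokes the callbacks one after another, which is why a slow callback would delay every later one for the same request.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class ListenerOrderDemo {
  // Hypothetical listener, modelled on the callback names used in the text.
  interface Listener {
    void onHeaders(String statusLine);
    void onContent(ByteBuffer chunk);
    void onComplete();
  }

  // Hypothetical stand-in for the client's I/O thread: callbacks run sequentially.
  static void deliver(List<String> chunks, Listener listener) {
    listener.onHeaders("HTTP/1.1 200 OK");
    for (String chunk : chunks) {
      listener.onContent(ByteBuffer.wrap(chunk.getBytes(StandardCharsets.UTF_8)));
    }
    listener.onComplete();
  }

  // Records the order in which the callbacks fire for a two-chunk response.
  static List<String> run() {
    List<String> events = new ArrayList<>();
    deliver(List.of("par", "tial"), new Listener() {
      public void onHeaders(String statusLine) { events.add("headers"); }
      public void onContent(ByteBuffer chunk) { events.add("content:" + chunk.remaining()); }
      public void onComplete() { events.add("complete"); }
    });
    return events;
  }
}
```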

      3. Solution 1: Sending requests async but spin one thread per response

      Jetty HttpClient already provides several listeners; one of them is InputStreamResponseListener. This is how it is used:

      InputStreamResponseListener listener = new InputStreamResponseListener();
      client.newRequest(...).send(listener);
      // Wait for the response headers to arrive
      Response response = listener.get(5, TimeUnit.SECONDS);
      if (response.getStatus() == 200) {
        // Obtain the input stream on the response content
        try (InputStream input = listener.getInputStream()) {
          // Read the response content
        }
      }

      In this case, there will be 2 threads:

      • one thread reading the response content from the InputStream
      • one thread (a short-lived task) feeding content into that InputStream whenever some byte[] is available. Note that if this thread is unable to feed data into the InputStream, it will wait.

      Using this listener, the model of HttpShardHandler can be written as something like this:

      handler.sendReq(req, (is) -> {
        executor.submit(() -> {
          try (is) {
            // Read the content from InputStream
          }
        });
      });

      The first diagram then changes into this:

      Notice that although “sending req to shard1” is drawn wide, it won’t take long, since sending a request is a very quick operation. With this approach, handling threads are not spun up until the first bytes are sent back. Notice also that we still have active threads waiting for more data from the InputStream.
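
      The two-thread interaction of Solution 1 can be imitated with JDK piped streams. This is only a sketch under assumptions: the calling thread stands in for Jetty's feeding thread, the pipe stands in for the listener's InputStream, and the 16-byte pipe size is an arbitrary value chosen so that the feeder would have to wait if the reader fell behind.

```java
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class StreamPerResponse {
  static String run(List<String> chunks) throws Exception {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    PipedOutputStream feed = new PipedOutputStream();
    // Small pipe: the feeder blocks when it is full.
    PipedInputStream is = new PipedInputStream(feed, 16);
    try {
      // Reader thread: blocks in read() whenever no data is available yet.
      Future<String> parsed = executor.submit(() -> {
        try (InputStream in = is) {
          return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
      });
      // Feeder (the I/O thread in a real client): pushes each chunk as it arrives.
      try (feed) {
        for (String chunk : chunks) {
          feed.write(chunk.getBytes(StandardCharsets.UTF_8));
        }
      }
      return parsed.get();
    } finally {
      executor.shutdown();
    }
  }
}
```

      The reader thread stays occupied for the whole lifetime of the response, which is the remaining cost of this solution.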

      4. Solution 2: Buffering data and handling it inside Jetty’s thread

      Jetty has another listener called BufferingResponseListener. This is how it is used:

      client.newRequest(...).send(new BufferingResponseListener() {
        @Override
        public void onComplete(Result result) {
          try {
            byte[] response = getContent();
            // handling response
          } catch (Exception e) { ... }
        }
      });

      On receiving data, Jetty (one of its threads) calls the listener with that data (here just a byte[] representing part of the response). The listener appends that byte[] to an internal buffer. When all the data has been received, Jetty calls onComplete on the listener, and inside that method we get the whole response.

      Using this listener, the model of HttpShardHandler can be written as something like this:

      handler.send(req, (byte[] data) -> {
        // handling data here
      });

      The first diagram then changes into this:


      Pros:

      • We don’t need an additional thread for each request → fewer threads
      • No thread is actively waiting for data from an InputStream → threads are busier

      Cons:

      • All data must be buffered before it can be parsed → double the memory is used for parsing a response.
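
      The buffering behaviour of Solution 2 can be sketched with a small JDK-only class. BufferingHandler is a hypothetical name, not Jetty's BufferingResponseListener; it only mimics the two steps described above: copy each chunk on arrival, then hand over the full body once.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.function.Consumer;

final class BufferingHandler {
  private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
  private final Consumer<byte[]> onResponse;

  BufferingHandler(Consumer<byte[]> onResponse) {
    this.onResponse = onResponse;
  }

  // Called by the I/O thread for each chunk: copy the bytes and return immediately.
  void onContent(ByteBuffer chunk) {
    byte[] bytes = new byte[chunk.remaining()];
    chunk.get(bytes);
    buffer.write(bytes, 0, bytes.length);
  }

  // Called once after the last chunk: the whole body is handed over in one piece.
  void onComplete() {
    onResponse.accept(buffer.toByteArray());
  }
}
```

      No extra thread is involved, but the complete body sits in memory before parsing starts, and the parsed result then needs memory of its own.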

      5. Solution 3: Why not both?

      Solution 1 is good for parsing very large or sometimes unbounded responses (as in StreamingExpression).

      Solution 2 is good for parsing small responses (maybe < 10KB), since the overhead is small.

      Should we combine both solutions? After all, what HttpSolrClient has returned so far for all requests is a NamedList<>, so as long as we can return a NamedList<>, whether Solution 1 or Solution 2 is used does not matter to users.

      Therefore the idea here is to decide based on the “CONTENT_LENGTH” header of the response. If the response body is smaller than a certain size we go with Solution 2; otherwise we go with Solution 1.

      Note: Solr does not seem to return Content-Length accurately; this needs more investigation.
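
      The selection rule could look roughly like the sketch below. The class and enum names are hypothetical, the 10KB threshold is just the figure mentioned above rather than a tuned value, and falling back to streaming when the length is missing is an assumption made here because of the note about inaccurate Content-Length.

```java
final class ListenerChooser {
  enum Strategy { BUFFER_IN_IO_THREAD, STREAM_IN_WORKER_THREAD }

  // Assumed threshold, taken from the "< 10KB" figure; not a tuned value.
  static final long SMALL_RESPONSE_BYTES = 10 * 1024;

  // contentLength < 0 means the header was absent or unusable.
  static Strategy choose(long contentLength) {
    if (contentLength >= 0 && contentLength < SMALL_RESPONSE_BYTES) {
      return Strategy.BUFFER_IN_IO_THREAD; // Solution 2
    }
    // Solution 1; also the assumed fallback when the size is unknown.
    return Strategy.STREAM_IN_WORKER_THREAD;
  }
}
```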

      6. Further improvement

      The best approach to this problem: instead of converting an InputStream to a NamedList, why don’t we convert the response byte by byte and make the parser resumable? Like this:

      Parser parser = new Parser();
      public void onContent(ByteBuffer buffer) {
        // feed the newly arrived bytes to the resumable parser
      }
      public void onComplete() {
        NamedList<> result = parser.getResult();
      }

      Therefore there will be no blocking operation inside the parser, which makes for a very efficient model. But doing this requires tons of changes in Solr: all ResponseParsers must be rewritten, not to mention the flow here. Not sure it is worth doing.
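
      To show what "resumable" means, here is a toy parser that keeps its state between calls, so a value split across two chunks is still parsed correctly. It handles an invented format (comma-separated non-negative ASCII integers), not any Solr response format, and the class name is hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

final class ResumableIntParser {
  private final List<Long> values = new ArrayList<>();
  private long current;     // digits of the number read so far
  private boolean inNumber; // true while a number is only partially read

  // Consumes whatever bytes are available and returns; state survives between calls.
  void parse(ByteBuffer buffer) {
    while (buffer.hasRemaining()) {
      char c = (char) buffer.get();
      if (c >= '0' && c <= '9') {
        current = current * 10 + (c - '0');
        inNumber = true;
      } else if (c == ',') {
        finishNumber();
      } else {
        throw new IllegalArgumentException("unexpected byte: " + c);
      }
    }
  }

  private void finishNumber() {
    if (inNumber) {
      values.add(current);
    }
    current = 0;
    inNumber = false;
  }

  // Called once the response is complete; flushes a trailing number, if any.
  List<Long> getResult() {
    finishNumber();
    return values;
  }
}
```

      Because parse() never waits for more input, it can run directly inside an onContent-style callback without blocking the I/O thread.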


        1. image-2020-10-23-16-45-37-628.png
          435 kB
          Varun Thacker
        2. image-2020-10-23-16-45-21-789.png
          435 kB
          Varun Thacker
        3. image-2020-10-23-16-45-20-034.png
          435 kB
          Varun Thacker
        4. image-2020-03-23-10-12-00-661.png
          21 kB
          Cao Manh Dat
        5. image-2020-03-23-10-09-10-221.png
          33 kB
          Cao Manh Dat
        6. image-2020-03-23-10-04-08-399.png
          20 kB
          Cao Manh Dat




              • Assignee:
                caomanhdat Cao Manh Dat



                  Time Tracking

                  Original Estimate - Not Specified
                  Remaining Estimate - 0h
                  Time Spent - 4.5h