Solr / SOLR-16265

reduce memory usage of ContentWriter based requests in Http2SolrClient

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 9.4
    • Component/s: None
    • Labels: None

    Description

      I recently noticed the code below exists in Http2SolrClient.createRequest...

      if (contentWriter != null) {
        Request req = httpClient.newRequest(url + wparams.toQueryString()).method(method);
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        contentWriter.write(baos);

        // TODO reduce memory usage
        return req.content(
            new BytesContentProvider(contentWriter.getContentType(), baos.toByteArray()));
      }

      • AFAICT there is no (other) existing Jira issue discussing this TODO
      • This method is called for most "simple" HTTP/2-based requests
        • i.e., from Http2SolrClient or CloudHttp2SolrClient, but not ConcurrentUpdateHttp2SolrClient
      • This block triggers for any request with a ContentWriter
        • i.e., all UpdateRequests ... and in theory other custom requests
      • Part of the issue seems to be that this code adapts the ContentWriter "push" style API to a "pull" style Jetty client API
        • This is despite Http2SolrClient having other code, used only by ConcurrentUpdateHttp2SolrClient (initOutStream(...)), which does leverage a "push" style Jetty client API: OutputStreamContentProvider
      • Sillier still: we build one (serialized) byte[] of the data in memory inside the ByteArrayOutputStream, then we call toByteArray(), which makes a second copy of that byte[].
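
To make the double copy concrete, here is a minimal, self-contained sketch using only the JDK. It is not the Solr or Jetty code: the ContentWriter interface below is a hypothetical stand-in for Solr's ContentWriter, CountingOutputStream stands in for a streaming sink (roughly the role a "push" style provider such as OutputStreamContentProvider would play), and the sample payload and content type are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

final class ContentWriterMemoryDemo {

  /** Hypothetical stand-in for Solr's push-style ContentWriter; not the real interface. */
  interface ContentWriter {
    void write(OutputStream os) throws IOException;

    String getContentType();
  }

  /** Counts bytes and discards them; stands in for a streaming sink such as a socket. */
  static final class CountingOutputStream extends OutputStream {
    long count;

    @Override
    public void write(int b) {
      count++;
    }

    @Override
    public void write(byte[] b, int off, int len) {
      count += len;
    }
  }

  /** Pattern from the issue: buffer the whole body, then copy it again. */
  static byte[] bufferThenCopy(ContentWriter writer) throws IOException {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    writer.write(baos); // first full copy lives in the stream's internal buffer
    return baos.toByteArray(); // second full copy: a newly allocated array
  }

  /** Push-style alternative: the writer streams into the sink, no intermediate byte[]. */
  static long streamTo(ContentWriter writer, CountingOutputStream sink) throws IOException {
    writer.write(sink);
    return sink.count;
  }

  /** Illustrative writer that emits a tiny made-up payload per document. */
  static ContentWriter sampleWriter(int docs) {
    return new ContentWriter() {
      @Override
      public void write(OutputStream os) throws IOException {
        for (int i = 0; i < docs; i++) {
          os.write(("<doc id=\"" + i + "\"/>").getBytes(StandardCharsets.UTF_8));
        }
      }

      @Override
      public String getContentType() {
        return "application/xml; charset=UTF-8"; // assumed value, for illustration only
      }
    };
  }

  public static void main(String[] args) throws IOException {
    ContentWriter writer = sampleWriter(1000);

    byte[] body = bufferThenCopy(writer);
    System.out.println("buffered bytes held in a standalone array: " + body.length);

    long streamed = streamTo(writer, new CountingOutputStream());
    System.out.println("bytes streamed without buffering: " + streamed);

    // Each toByteArray() call allocates a fresh array, hence the second copy.
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    writer.write(baos);
    System.out.println("distinct arrays: " + (baos.toByteArray() != baos.toByteArray()));
  }
}
```

In the buffered path, peak memory is roughly twice the serialized request size (the stream's internal buffer plus the copied array), while the streaming path only needs whatever the sink itself buffers. Whether the real fix should stream or simply avoid the second copy is a design choice this sketch does not settle.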

          People

            Assignee: Alex Deparvu (stillalex)
            Reporter: Chris M. Hostetter (hossman)
            Votes: 0
            Watchers: 5

              Time Tracking

                Original Estimate: Not Specified
                Remaining Estimate: 0h
                Time Spent: 3h