Solr / SOLR-4227

StreamingUpdateSolrServer does not buffer OutputStreamWriter with BufferedWriter, causing encoding explosion


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.2
    • Fix Version/s: 4.7, 6.0
    • Component/s: None
    • Labels: None
    • Environment: Java 1.6, Linux. I am running SOLR 3.2, but the code doesn't seem different in 3.5.

    Description

      org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer line 112 is:
          OutputStreamWriter writer = new OutputStreamWriter(out, "UTF-8");
      and then we call:
          req.writeXML( writer );
      Because the writer is not buffered, the XML writer invokes the UTF-8 encoder for each atom it writes, as in org.apache.solr.common.util.XML.writeXML:
          out.write('<');
      Each such call makes the stream encoder allocate a char array to hold the character, and sun.nio.cs.StreamEncoder.implWrite then allocates a CharBuffer to wrap it, all for a single character.

      This is particularly a problem when a lot of threads (100?) are writing to the SOLR server: they rapidly eat up all the CPU.

      It would be helpful to allocate the writer as a BufferedWriter, so encoding only happens when you flush. JavaDoc for OutputStreamWriter recommends this: "For top efficiency, consider wrapping an OutputStreamWriter within a BufferedWriter so as to avoid frequent converter invocations."
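
      The suggested change can be sketched roughly as follows. This is only an illustration, not the SolrJ code or the committed patch: the class BufferedXmlWriterSketch and the methods writeXml and render are hypothetical names, and writeXml merely imitates the many tiny writes that req.writeXML( writer ) performs. Only OutputStreamWriter and BufferedWriter are the real java.io classes discussed above.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;

public class BufferedXmlWriterSketch {

    private static final Charset UTF8 = Charset.forName("UTF-8");

    // Hypothetical stand-in for req.writeXML(writer): many very small writes,
    // including single characters such as out.write('<').
    public static void writeXml(Writer out, int docs) throws IOException {
        out.write("<add>");
        for (int i = 0; i < docs; i++) {
            out.write('<');
            out.write("doc");
            out.write('>');
            out.write("<field name=\"id\">");
            out.write(Integer.toString(i));
            out.write("</field>");
            out.write("</doc>");
        }
        out.write("</add>");
    }

    // Renders the XML either through a bare OutputStreamWriter (as on the
    // line quoted above) or through a BufferedWriter wrapped around it.
    public static byte[] render(boolean buffered, int docs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        Writer writer = new OutputStreamWriter(out, UTF8);
        if (buffered) {
            // With the wrapper, the small writes collect in a char buffer and
            // reach the encoder in large chunks instead of one atom at a time.
            writer = new BufferedWriter(writer);
        }
        try {
            writeXml(writer, docs);
            writer.flush(); // push buffered chars through the encoder
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown to keep the sketch short.
            throw new IllegalStateException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] plain = render(false, 1000);
        byte[] wrapped = render(true, 1000);
        // The bytes on the wire are the same; only the number of encoder calls differs.
        System.out.println("identical output: " + java.util.Arrays.equals(plain, wrapped));
    }
}
```

      The output bytes are unchanged by the wrapper, so a change along these lines should be invisible to the server; the one thing to watch is that the writer is flushed before the request body is considered complete.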

          People

            Assignee: shalin Shalin Shekhar Mangar
            Reporter: ckherrmann Conrad Herrmann
            Votes: 0
            Watchers: 4
