  1. Solr
  2. SOLR-14249

Krb5HttpClientBuilder should not buffer requests

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 7.4, 8.4.1, main (9.0)
    • Fix Version/s: None
    • Component/s: Authentication, SolrJ
    • Labels:
      None

      Description

      When SolrJ clients enable Kerberos authentication, a request interceptor is set up which wraps the actual HttpEntity in a BufferedHttpEntity. This BufferedHttpEntity, well, buffers the request body in a byte[] so it can be repeated if needed. This works fine for small requests, but when requests get large, storing the entire request in memory causes memory pressure or OutOfMemoryErrors.
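To make the cost of buffering concrete, here is a minimal sketch using only the JDK. `BodySource`, `BufferedBody`, and `BufferingSketch` are hypothetical stand-ins invented for illustration; they are not the Apache HttpClient or SolrJ classes. The sketch only models the behavior described above: the wrapper drains the whole body into a byte[] in its constructor so that it can be replayed later.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical stand-in for an HTTP request entity; NOT the Apache HttpClient interface.
interface BodySource {
    void writeTo(OutputStream out) throws IOException;
}

// Models a buffering wrapper: the whole body is copied into a byte[] up front.
final class BufferedBody implements BodySource {
    private final byte[] buffer;

    BufferedBody(BodySource wrapped) throws IOException {
        ByteArrayOutputStream copy = new ByteArrayOutputStream();
        wrapped.writeTo(copy);            // drains the entire body into heap memory
        this.buffer = copy.toByteArray();
    }

    int bufferedBytes() {
        return buffer.length;
    }

    @Override
    public void writeTo(OutputStream out) throws IOException {
        out.write(buffer);                // replayable, at the cost of holding every byte
    }
}

final class BufferingSketch {
    // A body of `chunks` chunks of `chunkSize` bytes each, generated on the fly.
    static BodySource streamingBody(int chunks, int chunkSize) {
        return out -> {
            byte[] chunk = new byte[chunkSize];
            for (int i = 0; i < chunks; i++) {
                out.write(chunk);
            }
        };
    }

    public static void main(String[] args) throws IOException {
        BufferedBody buffered = new BufferedBody(streamingBody(1_000, 1_024));
        System.out.println("bytes held on heap: " + buffered.bufferedBytes());
    }
}
```

The streaming body never holds more than one chunk at a time, while the wrapper's heap footprint grows linearly with the size of the request.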

      The easiest way to trigger this is to use ConcurrentUpdateSolrClient (CUSC), which opens a connection to Solr and streams documents out in an ever-increasing request entity until the doc queue held by the client is emptied.
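The queue-draining pattern can be sketched roughly as follows. This is a simplified, single-threaded model, not the actual ConcurrentUpdateSolrClient code; `QueueDrainSketch`, `CountingSink`, `drain`, and `queueOf` are hypothetical names. It shows why the body has no fixed upper bound: one request keeps growing for as long as the queue has documents.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Queue;

final class QueueDrainSketch {
    // Sink that counts bytes and discards them, standing in for a network socket.
    static final class CountingSink extends OutputStream {
        long written;

        @Override
        public void write(int b) {
            written++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            written += len;
        }
    }

    // Writes queued documents into ONE request body until the queue is empty.
    static int drain(Queue<String> docs, OutputStream body) throws IOException {
        int count = 0;
        String doc;
        while ((doc = docs.poll()) != null) {
            body.write(doc.getBytes(StandardCharsets.UTF_8));
            count++;
        }
        return count;
    }

    // Builds a queue of n small placeholder documents.
    static Queue<String> queueOf(int n) {
        Queue<String> docs = new ArrayDeque<>();
        for (int i = 0; i < n; i++) {
            docs.add("doc-" + i);
        }
        return docs;
    }

    public static void main(String[] args) throws IOException {
        // Streaming: bytes pass through and nothing is retained.
        CountingSink sink = new CountingSink();
        drain(queueOf(100_000), sink);
        System.out.println("streamed bytes: " + sink.written);

        // Buffering: the same bytes all stay on the heap until the request completes.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        drain(queueOf(100_000), buffer);
        System.out.println("retained bytes: " + buffer.size());
    }
}
```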

      I ran into this while troubleshooting a DataImportHandler (DIH) run that would reproducibly load a few hundred thousand documents before progress stalled. Solr never crashed and the DIH thread was still alive, but the ConcurrentUpdateSolrClient used by DIH had its "Runner" thread disappear around the time of the stall, and an OOM like the one below could be seen in solr-8983-console.log:

      WARNING: Uncaught exception in thread: Thread[concurrentUpdateScheduler-28-thread-1,5,TGRP-TestKerberosClientBuffering]
      java.lang.OutOfMemoryError: Java heap space
        at __randomizedtesting.SeedInfo.seed([371A00FBA76D31DF]:0)
        at java.base/java.util.Arrays.copyOf(Arrays.java:3745)
        at java.base/java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:120)
        at java.base/java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:95)
        at java.base/java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:156)
        at org.apache.solr.common.util.FastOutputStream.flush(FastOutputStream.java:213)
        at org.apache.solr.common.util.FastOutputStream.write(FastOutputStream.java:94)
        at org.apache.solr.common.util.ByteUtils.writeUTF16toUTF8(ByteUtils.java:145)
        at org.apache.solr.common.util.JavaBinCodec.writeStr(JavaBinCodec.java:848)
        at org.apache.solr.common.util.JavaBinCodec.writePrimitive(JavaBinCodec.java:932)
        at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:328)
        at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:228)
        at org.apache.solr.common.util.JavaBinCodec.writeSolrInputDocument(JavaBinCodec.java:616)
        at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:355)
        at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:228)
        at org.apache.solr.common.util.JavaBinCodec.writeMapEntry(JavaBinCodec.java:764)
        at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:383)
        at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:228)
        at org.apache.solr.common.util.JavaBinCodec.writeIterator(JavaBinCodec.java:705)
        at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:367)
        at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:228)
        at org.apache.solr.common.util.JavaBinCodec.writeNamedList(JavaBinCodec.java:223)
        at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:330)
        at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:228)
        at org.apache.solr.common.util.JavaBinCodec.marshal(JavaBinCodec.java:155)
        at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.marshal(JavaBinUpdateRequestCodec.java:91)
        at org.apache.solr.client.solrj.impl.BinaryRequestWriter.write(BinaryRequestWriter.java:83)
        at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner$1.writeTo(ConcurrentUpdateSolrClient.java:264)
        at org.apache.http.entity.EntityTemplate.writeTo(EntityTemplate.java:73)
        at org.apache.http.entity.BufferedHttpEntity.<init>(BufferedHttpEntity.java:62)
        at org.apache.solr.client.solrj.impl.Krb5HttpClientBuilder.lambda$new$3(Krb5HttpClientBuilder.java:155)
        at org.apache.solr.client.solrj.impl.Krb5HttpClientBuilder$$Lambda$459/0x0000000800623840.process(Unknown Source)
        at org.apache.solr.client.solrj.impl.HttpClientUtil$DynamicInterceptor$1.accept(HttpClientUtil.java:177)
      

      We took heap dumps and were able to confirm that the entire 8 GB heap was taken up by a single massive CUSC request body that was being buffered!

      (As an aside, I had no idea that OutOfMemoryErrors could happen without killing the entire JVM, but apparently they can. CUSC.Runner propagates the OOM as it should, and the OOM kills the Runner thread. Since that thread is the GC root for the massive BufferedHttpEntity, though, a garbage collection then frees up most of the heap space and the JVM survives its memory trouble. Solr's OOM script never triggers.)
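The aside above can be illustrated with a small JDK-only sketch. An Error that escapes a thread's run method ends that thread, not the JVM. The OutOfMemoryError here is thrown explicitly for illustration instead of exhausting the heap, and `ThreadDeathSketch` and `runFailingWorker` are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicReference;

final class ThreadDeathSketch {
    // Runs a worker that dies with an Error and returns what its
    // uncaught-exception handler observed.
    static Throwable runFailingWorker() throws InterruptedException {
        AtomicReference<Throwable> seen = new AtomicReference<>();
        Thread worker = new Thread(() -> {
            // Simulated: a real OOM would come from a failed allocation.
            throw new OutOfMemoryError("simulated: Java heap space");
        }, "runner-sketch");
        worker.setUncaughtExceptionHandler((thread, error) -> seen.set(error));
        worker.start();
        worker.join();   // returns once the worker thread has terminated
        return seen.get();
    }

    public static void main(String[] args) throws InterruptedException {
        Throwable error = runFailingWorker();
        System.out.println("worker died with: " + error);
        System.out.println("main thread is still running");
    }
}
```

Once the worker is gone, anything reachable only through its stack becomes eligible for collection, which matches the recovery seen in the heap dumps.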

      I've attached a JUnit test which reproduces the OOM issue by using a "fake" Kerberos config.

        Attachments

        1. SOLR-14249-reproduction.patch
          174 kB
          Jason Gerlowski


              People

              • Assignee: Unassigned
              • Reporter: Jason Gerlowski (gerlowskija)
              • Votes: 0
              • Watchers: 2
