Solr / SOLR-10375

Stored text retrieved via StoredFieldVisitor on doc in the document cache over-estimates needed byte[]


Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Environment: Java 1.8.121, Linux x64

    Description

      When SolrIndexSearcher.doc(int n, StoredFieldVisitor visitor) is used (as can happen with the UnifiedHighlighter in particular) and the document cache already holds the document, the call is routed to visitFromCached. For a very large stored text value this fails with an out of memory error at line 752 of SolrIndexSearcher:

      visitor.stringField(info, f.stringValue().getBytes(StandardCharsets.UTF_8));

        at java.lang.OutOfMemoryError.<init>()V (OutOfMemoryError.java:48)
        at java.lang.StringCoding.encode(Ljava/nio/charset/Charset;[CII)[B (StringCoding.java:350)
        at java.lang.String.getBytes(Ljava/nio/charset/Charset;)[B (String.java:941)
        at org.apache.solr.search.SolrIndexSearcher.visitFromCached(Lorg/apache/lucene/document/Document;Lorg/apache/lucene/index/StoredFieldVisitor;)V (SolrIndexSearcher.java:685)
        at org.apache.solr.search.SolrIndexSearcher.doc(ILorg/apache/lucene/index/StoredFieldVisitor;)V (SolrIndexSearcher.java:652)
      

      This is caused by the current String.getBytes(Charset) implementation, which sizes the destination byte array up front as charArrayLength * maxBytesPerCharacter; for UTF-8, maxBytesPerCharacter is 3. 3 * 716MB exceeds Integer.MAX_VALUE, the largest length a single Java array can have, so the JVM throws an OutOfMemoryError: an array of that size cannot be allocated, no matter how much heap is available.
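
      The arithmetic can be illustrated with a minimal sketch. The class and method names (Utf8SizeEstimate, worstCaseBytes, exactUtf8Length) are hypothetical and not part of Solr or the JDK, and reading "716MB" as about 716 million chars is an assumption. The sketch compares the worst-case estimate of 3 bytes per char with the number of bytes UTF-8 actually needs for a given string.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not Solr or JDK code.
class Utf8SizeEstimate {

    // Worst-case UTF-8 size: 3 bytes per UTF-16 char.
    // Computed as a long so the result itself cannot overflow.
    static long worstCaseBytes(long charCount) {
        return charCount * 3L;
    }

    // Exact number of bytes String.getBytes(UTF_8) would produce,
    // counted without allocating the output array.
    static long exactUtf8Length(CharSequence s) {
        long bytes = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                bytes += 1;                       // ASCII
            } else if (c < 0x800) {
                bytes += 2;                       // two-byte sequence
            } else if (Character.isSurrogate(c)) {
                if (Character.isHighSurrogate(c)
                        && i + 1 < s.length()
                        && Character.isLowSurrogate(s.charAt(i + 1))) {
                    bytes += 4;                   // valid pair: one 4-byte code point
                    i++;                          // skip the low surrogate
                } else {
                    bytes += 1;                   // unpaired surrogate is replaced by '?'
                }
            } else {
                bytes += 3;                       // rest of the BMP
            }
        }
        return bytes;
    }

    public static void main(String[] args) {
        // Assumed reading of the report: a value of about 716 million chars.
        long chars = 716_000_000L;
        System.out.println("worst case bytes:  " + worstCaseBytes(chars));
        System.out.println("Integer.MAX_VALUE: " + Integer.MAX_VALUE);

        String sample = "h\u00e9llo w\u00f6rld";
        System.out.println("estimate: " + worstCaseBytes(sample.length())
                + ", exact: " + exactUtf8Length(sample)
                + ", getBytes: " + sample.getBytes(StandardCharsets.UTF_8).length);
    }
}
```

      For mostly ASCII text the exact length is close to one byte per char, so the worst-case estimate is about three times larger than necessary. A counting pass like this is one possible way to size a buffer precisely, at the cost of an extra scan over the string.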

      The problem is not present when the document cache is disabled.


            People

              Assignee: Unassigned
              Reporter: Michael Braun (mbraun688)