  1. Solr
  2. SOLR-8922

DocSetCollector can allocate massive garbage on large indexes

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      After reaching a point of diminishing returns tuning the garbage collector, I decided to look at where the garbage was coming from. To my surprise, for my index and query set, almost 60% of the garbage came from this single line:

      https://github.com/apache/lucene-solr/blob/94c04237cce44cac1e40e1b8b6ee6a6addc001a5/solr/core/src/java/org/apache/solr/search/DocSetCollector.java#L49

      This is due to the simple fact that I have 86M documents in my shards. Allocating a scratch array big enough to track a result set 1/64th the size of my index (about 1.3M documents) is almost certainly excessive, considering my 99.9th percentile hit count is less than 56k.
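
To make the arithmetic above concrete, here is a minimal sketch, not the code from the attached patches. It assumes the scratch array is an `int[]` sized at `maxDoc >> 6` (inferred from the "1/64th" figure in the description), and it contrasts that eager allocation with a hypothetical `GrowingIntBuffer` that allocates doubling chunks only as hits arrive. The names `ScratchSizing`, `GrowingIntBuffer`, and the initial chunk size of 64 are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; class and method names are hypothetical.
final class ScratchSizing {

    /** Entries in a scratch array sized eagerly at 1/64th of the index (assumed: maxDoc >> 6). */
    static int eagerScratchEntries(int maxDoc) {
        return maxDoc >> 6;
    }

    /** Bytes of int payload in that eagerly allocated array (array header ignored). */
    static long eagerScratchBytes(int maxDoc) {
        return (long) eagerScratchEntries(maxDoc) * Integer.BYTES;
    }

    /** Hypothetical buffer that allocates doubling chunks only when hits actually arrive. */
    static final class GrowingIntBuffer {
        private static final int MAX_CHUNK = 1 << 20; // cap on chunk growth

        private final List<int[]> chunks = new ArrayList<>();
        private int[] current;
        private int posInCurrent;
        private int size;
        private long allocatedEntries;
        private int nextChunkSize;

        GrowingIntBuffer(int firstChunkSize) {
            if (firstChunkSize <= 0) {
                throw new IllegalArgumentException("firstChunkSize must be positive");
            }
            this.nextChunkSize = firstChunkSize;
        }

        void add(int value) {
            if (current == null || posInCurrent == current.length) {
                // Current chunk is full (or absent): allocate the next, larger one.
                current = new int[nextChunkSize];
                chunks.add(current);
                allocatedEntries += nextChunkSize;
                if (nextChunkSize < MAX_CHUNK) {
                    nextChunkSize <<= 1;
                }
                posInCurrent = 0;
            }
            current[posInCurrent++] = value;
            size++;
        }

        int size() {
            return size;
        }

        long allocatedEntries() {
            return allocatedEntries;
        }

        /** Copies the collected values, in insertion order, into one array. */
        int[] toArray() {
            int[] out = new int[size];
            int copied = 0;
            for (int[] chunk : chunks) {
                int n = Math.min(chunk.length, size - copied);
                System.arraycopy(chunk, 0, out, copied, n);
                copied += n;
            }
            return out;
        }
    }

    public static void main(String[] args) {
        int maxDoc = 86_000_000; // document count from the report
        System.out.println("eager entries: " + eagerScratchEntries(maxDoc));
        System.out.println("eager bytes:   " + eagerScratchBytes(maxDoc));

        GrowingIntBuffer buf = new GrowingIntBuffer(64);
        for (int doc = 0; doc < 56_000; doc++) { // roughly the 99.9th percentile hit count
            buf.add(doc);
        }
        System.out.println("lazy entries:  " + buf.allocatedEntries());
    }
}
```

Under these assumptions, the eager array costs about 5.4 MB of `int` storage per collector on an 86M-document index, while a query at the reported 99.9th percentile hit count would need only a few hundred kilobytes if the buffer grew on demand. The trade-off is some extra copying and bookkeeping for the rare large result set.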

        Attachments

        1. SOLR-8922.patch
          5 kB
          Yonik Seeley
        2. SOLR-8922.patch
          5 kB
          Jeff Wartes

              People

              • Assignee:
                yseeley@gmail.com Yonik Seeley
                Reporter:
                jwartes Jeff Wartes
              • Votes:
                0
                Watchers:
                10
