Solr / SOLR-9125

CollapseQParserPlugin allocations are index based, not query based

Details

    Description

      Among other things, CollapsingQParserPlugin’s OrdScoreCollector allocates space per-query for:
      1 int (doc id) per ordinal
      1 float (score) per ordinal
      1 bit (FixedBitSet) per document in the index

      So the higher the cardinality of the field you’re collapsing on, and the more documents in the index, the more memory gets consumed per query. Since high cardinality and large indexes are the use cases CollapsingQParserPlugin was designed for, I thought I'd point this out.
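To put rough numbers on the three allocations listed above, here is a minimal back-of-the-envelope sketch. The class name, method name, and the example index sizes are hypothetical and not part of Solr. It assumes the bit set is backed by 64-bit words and ignores array headers and any other per-query state, so treat the result as a lower bound rather than a measurement.

```java
final class CollapseAllocationEstimate {

    /**
     * Rough per-query allocation in bytes for the three structures described
     * in this issue: one int and one float per ordinal, plus one bit per
     * document in the index.
     */
    static long estimateBytes(long ordinalCount, long maxDoc) {
        long docIdBytes = ordinalCount * Integer.BYTES; // 1 int (doc id) per ordinal
        long scoreBytes = ordinalCount * Float.BYTES;   // 1 float (score) per ordinal
        // 1 bit per document, assumed to be stored in 64-bit words, rounded up.
        long bitSetBytes = ((maxDoc + 63) / 64) * Long.BYTES;
        return docIdBytes + scoreBytes + bitSetBytes;
    }

    public static void main(String[] args) {
        // Hypothetical index: 100M documents, 10M distinct values in the collapse field.
        long ordinals = 10_000_000L;
        long maxDoc = 100_000_000L;
        System.out.println(estimateBytes(ordinals, maxDoc) + " bytes per query");
    }
}
```

Note that neither input is the number of documents the query matches, which is the point of this issue.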

      My real issue is that this allocation does not vary with the number of results in the query, either before or after collapsing, so a query that matches one document consumes the same amount of memory as one that matches every document. All of the collectors suffer from this to some degree, but I think OrdScoreCollector is the worst offender.

            People

              Assignee: Unassigned
              Reporter: Jeff Wartes (jwartes)
              Votes: 0
              Watchers: 6
