Solr / SOLR-7580

Number of ScoreDoc instances equals rows parameter, not actual number of matches

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 5.1
    • Fix Version/s: 5.5, 6.0
    • Component/s: None
    • Labels: None

      Description

      We have several batch jobs that use StreamingResponseCallback to collect all records matching a specific query. For each record, we execute a new query and need all of its results without paging through them. Because we do not know how many matches to expect, we call setRows(Integer.MAX_VALUE). According to VisualVM samples, this creates a huge number of ScoreDoc instances, which makes the query unreasonably slow.

      Our current work-around is to execute the same query with setRows(0), read the number of matches (numResults), and then reissue the query with setRows(numResults). This is fast, almost as fast as one would expect.

      This is, however, a very dirty work-around. I am unsure whether this is a Solr or a Lucene issue; SolrIndexSearcher is a beast to debug.
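
      The work-around described above can be sketched as follows. This is only an illustration of the count-then-fetch pattern, not SolrJ code: the Searcher interface, the Page holder and the InMemorySearcher fake are hypothetical names introduced here so that the example runs on its own. With SolrJ, the two phases would correspond to SolrQuery.setRows(0) followed by setRows(n), where n comes from getResults().getNumFound().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TwoPhaseFetch {

    /** Hypothetical stand-in for a Solr client; not part of SolrJ. */
    interface Searcher {
        /** Returns the total match count and at most `rows` documents. */
        Page search(String query, int rows);
    }

    /** Hypothetical result holder: total matches plus the returned documents. */
    static final class Page {
        final long numFound;
        final List<String> docs;

        Page(long numFound, List<String> docs) {
            this.numFound = numFound;
            this.docs = docs;
        }
    }

    /** Fetches every match without asking for Integer.MAX_VALUE rows. */
    static List<String> fetchAll(Searcher searcher, String query) {
        // Phase 1: rows=0 asks only for the match count, no documents.
        long numFound = searcher.search(query, 0).numFound;
        if (numFound == 0) {
            return new ArrayList<>();
        }
        // Phase 2: request exactly as many rows as there are matches.
        int rows = (int) Math.min(numFound, Integer.MAX_VALUE);
        return searcher.search(query, rows).docs;
    }

    /** In-memory fake (substring match) used only to make the sketch runnable. */
    static final class InMemorySearcher implements Searcher {
        private final List<String> index;
        int lastRows = -1; // records the rows value of the most recent call

        InMemorySearcher(List<String> index) {
            this.index = index;
        }

        @Override
        public Page search(String query, int rows) {
            lastRows = rows;
            List<String> matches = new ArrayList<>();
            for (String doc : index) {
                if (doc.contains(query)) {
                    matches.add(doc);
                }
            }
            int end = Math.min(rows, matches.size());
            return new Page(matches.size(), new ArrayList<>(matches.subList(0, end)));
        }
    }

    public static void main(String[] args) {
        InMemorySearcher searcher =
                new InMemorySearcher(Arrays.asList("solr one", "lucene", "solr two"));
        List<String> docs = fetchAll(searcher, "solr");
        System.out.println(docs.size() + " docs, rows used: " + searcher.lastRows);
    }
}
```

      Note that the match count can change between the two queries if the index is updated in the meantime, so the second query may return fewer documents than exist at that moment.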

            People

            • Assignee: Unassigned
            • Reporter: Markus Jelsma (markus17)
            • Votes: 2
            • Watchers: 5
