Jackrabbit Oak / OAK-3129

SolrQueryIndex making too many Solr requests per JCR query

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.17, 1.2.2, 1.3.2
    • Fix Version/s: 1.0.18, 1.2.4, 1.3.3, 1.4
    • Component/s: solr
    • Labels:
      None

      Description

      SolrQueryIndex and FilterQueryParser use the OakSolrConfiguration#getRows setting to set the number of documents fetched per batch while iterating the Cursor that results from a query.
      This optimization avoids loading all results into memory when only, for example, the first 10 results of the Cursor are visited. However, it performs poorly when the result set's cardinality is 10 or more times larger than the 'rows' setting: each JCR query then triggers 10 or more Solr queries, each adding network and Solr call latency.

      To avoid that, we could use the 'rows' setting for the first request to Solr and then adapt the subsequent paged requests (controlled by the start and rows Solr HTTP parameters) so that the rest of the result set is fetched in no more than 2 Solr queries. This can be done by reading the numFound value from Solr's response to the first query and setting the start/rows parameters accordingly.
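
      The proposal above can be sketched as follows. This is only an illustration of the start/rows arithmetic, not the actual SolrQueryIndex change: the class TwoRequestPaging, its Page type, and its method names are hypothetical. It assumes one reading of the proposal, in which a single follow-up request covers everything the first request did not return. No Solr client is involved; numFound is passed in as if it had been read from the first response.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: plans the (start, rows) pairs for at most two Solr requests. */
final class TwoRequestPaging {

    /** One paged request, expressed as Solr's 'start' and 'rows' parameters. */
    static final class Page {
        final long start;
        final long rows;

        Page(long start, long rows) {
            this.start = start;
            this.rows = rows;
        }

        @Override
        public String toString() {
            return "start=" + start + "&rows=" + rows;
        }
    }

    /** First request: always uses the configured 'rows' value. */
    static Page firstPage(int configuredRows) {
        if (configuredRows <= 0) {
            throw new IllegalArgumentException("rows must be positive");
        }
        return new Page(0, configuredRows);
    }

    /**
     * Follow-up request: starts where the first one ended and asks for all
     * remaining documents. Returns null when the first request already
     * covered the whole result set.
     */
    static Page remainderPage(int configuredRows, long numFound) {
        long remaining = numFound - configuredRows;
        return remaining > 0 ? new Page(configuredRows, remaining) : null;
    }

    /** Full plan for a query whose first response reported 'numFound' matches. */
    static List<Page> plan(int configuredRows, long numFound) {
        List<Page> pages = new ArrayList<>();
        pages.add(firstPage(configuredRows));
        Page rest = remainderPage(configuredRows, numFound);
        if (rest != null) {
            pages.add(rest);
        }
        return pages;
    }
}
```

      With rows set to 10 and numFound equal to 105, plan(10, 105) yields two requests, start=0&rows=10 and start=10&rows=95, where fixed-size batching would issue 11. The trade-off is that the second request loads the entire remainder at once, so the memory saving only holds for callers that stop within the first batch.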

            People

            • Assignee:
              teofili Tommaso Teofili
              Reporter:
              teofili Tommaso Teofili