[SOLR-5611] When documents are uniformly distributed over shards, enable returning approximated results in distributed query - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Improvement
Status: Reopened
Priority: Minor
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: 4.9, 6.0
Component/s: SolrCloud
Labels:

Description

Query with rows=1000, which sent to a collection of 100 shards (shard key behaviour is default - based on hash of the unique key), will generate 100 requests of rows=1000, on each shard.
This results to total number of rows*numShards unique keys to be retrieved. This behaviour is getting worst as numShards grows.

If the documents are uniformly distributed over the shards, the expected number of document should be ~ rows/numShards. Obviously, there might be extreme cases, when all of the top X documents are in a specific shard.

I suggest adding an optional parameter, say approxResults=true, which decides whether we should limit the rows in the shard requests to rows/numShardsor not. Moreover, we can add a numeric parameter which increases the limit, to be more accurate.
For example, the query approxResults=true&approxResults.factor=1.5 will retrieve 1.5*rows/numShards from each shard. In the case of 100 shards and rows=1000, each shard will return 15 documents.

Furthermore, this can reduce the problem of deep paging, because the same thing can be applied there. when requested start=100000, Solr creating shard request with start=0 and rows=START+ROWS. In the approximated approach, start parameter (in the shard requests) can be set to 100000/numShards. The idea of the approxResults.factor creates some difficulties here, though.

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending

lec5-distributedIndexing.pdf
18/Nov/14 13:31
196 kB
Manuel Lenormand

Issue Links

relates to

SOLR-6810 Faster searching limited but high rows across many shards all with many hits

Open

Activity

People

Assignee:: Unassigned

Reporter:: Isaac Hebsh

Votes:: 2 Vote for this issue

Watchers:: 8 Start watching this issue

Dates

Created:: 06/Jan/14 21:01

Updated:: 24/May/23 06:00