Solr / SOLR-2218

Performance of start= and rows= parameters degrades sharply with large data sets


    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.4.1
    • Fix Version/s: None
    • Component/s: Build
    • Labels: None

      Description

      This concerns large data sets (> 10M rows).

      Setting start=<large number> and rows=<large number> is slow, and it gets slower the farther the offset moves from start=0, especially with a complex query. Random sorting makes it slower still.
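      The reason it degrades, roughly: to return the page at offset N, the top N+rows matches have to be collected and ranked, and the first N are then thrown away. A minimal Lucene-level sketch of that pattern (an illustration under that assumption, not Solr's actual code path):

      import java.io.IOException;
      import java.util.Arrays;
      import org.apache.lucene.search.IndexSearcher;
      import org.apache.lucene.search.Query;
      import org.apache.lucene.search.ScoreDoc;
      import org.apache.lucene.search.TopDocs;

      class OffsetPaging {
          // Fetch one page of hits. The search itself must collect offset + rows
          // hits before the first offset of them can be skipped, so the work per
          // page grows as the offset grows.
          static ScoreDoc[] page(IndexSearcher searcher, Query query, int offset, int rows) throws IOException {
              TopDocs top = searcher.search(query, offset + rows);
              int end = Math.min(offset + rows, top.scoreDocs.length);
              return offset >= end ? new ScoreDoc[0]
                                   : Arrays.copyOfRange(top.scoreDocs, offset, end);
          }
      }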

      We would like to make looping through large result sets faster. It would be nice if we could pass a pointer to the result set to resume iteration, or to support very large rows=<number> values.

      Something like:
      rows=1000
      start=0
      spointer=string_my_query_1

      Then, within some interval (say, 5 minutes), the same result set could be referenced to continue the loop:
      rows=1000
      start=1000
      spointer=string_my_query_1

      What do you think? Since the data set is so large, the cache is not helping.
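      The loop described above has to be driven today by incrementing start=, which is exactly the part that gets more expensive as the offset grows. A minimal SolrJ sketch of that client-side loop (the collection URL, query, and page size are placeholders, and the HttpSolrClient API shown is from a newer SolrJ than the 1.4-era client):

      import org.apache.solr.client.solrj.SolrQuery;
      import org.apache.solr.client.solrj.impl.HttpSolrClient;
      import org.apache.solr.client.solrj.response.QueryResponse;
      import org.apache.solr.common.SolrDocumentList;

      public class DeepPagingLoop {
          public static void main(String[] args) throws Exception {
              // Placeholder URL and query; adjust for the actual deployment.
              try (HttpSolrClient client =
                       new HttpSolrClient.Builder("http://localhost:8983/solr/collection1").build()) {
                  final int rows = 1000;
                  long start = 0;
                  long numFound;
                  do {
                      SolrQuery q = new SolrQuery("*:*");
                      q.setStart((int) start);  // every page re-collects the first start+rows matches,
                      q.setRows(rows);          // so each iteration is slower than the last
                      QueryResponse rsp = client.query(q);
                      SolrDocumentList docs = rsp.getResults();
                      numFound = docs.getNumFound();
                      // ... process docs ...
                      start += rows;
                  } while (start < numFound);
              }
          }
      }

      With the spointer idea, the client would hand back an opaque token instead of an ever-growing offset, so each page would cost roughly the same.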

    People

    • Assignee: Unassigned
    • Reporter: billnbell (Bill Bell)
    • Votes: 2
    • Watchers: 3

    Dates

    • Created:
    • Updated:
    • Resolved: