Solr / SOLR-5463

Provide cursor/token based "searchAfter" support that works with arbitrary sorting (ie: "deep paging")

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.7, 5.0
    • Component/s: None
    • Labels: None

      Description

      I'd like to revisit a solution to the problem of "deep paging" in Solr, leveraging an HTTP based API similar to how IndexSearcher.searchAfter works at the Lucene level: require the clients to provide back a token indicating the sort values of the last document seen on the previous "page". This is similar to the "cursor" model I've seen in several other REST APIs that support "pagination" over large sets of results (notably the Twitter API and its "since_id" param) except that we'll want something that works with arbitrary multi-level sort criteria that can be either ascending or descending.

      SOLR-1726 laid some initial groundwork here and was committed quite a while ago, but the key bit of argument parsing to leverage it was commented out due to some problems (see comments in that issue). It's also somewhat out of date at this point: at the time it was committed, IndexSearcher only supported searchAfter for simple scores, not arbitrary field sorts; and the params added in SOLR-1726 suffer from this limitation as well.

      I think it would make sense to start fresh with a new issue with a focus on ensuring that we have deep paging which:

      • supports arbitrary field sorts in addition to sorting by score
      • works in distributed mode
      Basic Usage
      • send a request with sort=X&start=0&rows=N&cursorMark=*
        • sort can be anything, but must include the uniqueKey field (as a tie breaker)
        • "N" can be any number you want per page
        • start must be "0"
        • "*" denotes you want to use a cursor starting at the beginning mark
      • parse the response body and extract the (String) nextCursorMark value
      • Replace the "*" value in your initial request params with the nextCursorMark value from the response in the subsequent request
      • repeat until the nextCursorMark value stops changing, or you have collected as many docs as you need (a SolrJ sketch of this loop follows)
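
      Putting those steps together, here is a minimal SolrJ sketch of the fetch loop (a sketch assuming the Solr 4.7 Java client and a collection whose uniqueKey field is "id"; the URL and page size are illustrative):

        import org.apache.solr.client.solrj.SolrQuery;
        import org.apache.solr.client.solrj.impl.HttpSolrServer;
        import org.apache.solr.client.solrj.response.QueryResponse;
        import org.apache.solr.common.SolrDocument;
        import org.apache.solr.common.params.CursorMarkParams;

        public class CursorWalk {
          public static void main(String[] args) throws Exception {
            HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/collection1");
            SolrQuery q = new SolrQuery("*:*");
            q.setRows(100);                            // "N" docs per page
            q.setSort(SolrQuery.SortClause.asc("id")); // sort must include the uniqueKey field
            String cursorMark = CursorMarkParams.CURSOR_MARK_START; // "*"
            while (true) {
              q.set(CursorMarkParams.CURSOR_MARK_PARAM, cursorMark);
              QueryResponse rsp = solr.query(q);
              for (SolrDocument doc : rsp.getResults()) {
                System.out.println(doc.getFieldValue("id")); // process each doc
              }
              String next = rsp.getNextCursorMark();
              if (cursorMark.equals(next)) break; // mark stopped changing: done
              cursorMark = next;
            }
          }
        }
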
      Attachments

      1. SOLR-5463__straw_man__MissingStringLastComparatorSource.patch
        3 kB
        Steve Rowe
      2. SOLR-5463__straw_man.patch
        99 kB
        Hoss Man
      3. SOLR-5463__straw_man.patch
        97 kB
        Hoss Man
      4. SOLR-5463__straw_man.patch
        94 kB
        Hoss Man
      5. SOLR-5463__straw_man.patch
        94 kB
        Hoss Man
      6. SOLR-5463__straw_man.patch
        94 kB
        Hoss Man
      7. SOLR-5463__straw_man.patch
        70 kB
        Hoss Man
      8. SOLR-5463__straw_man.patch
        70 kB
        Hoss Man
      9. SOLR-5463__straw_man.patch
        64 kB
        Hoss Man
      10. SOLR-5463__straw_man.patch
        58 kB
        Hoss Man
      11. SOLR-5463__straw_man.patch
        39 kB
        Hoss Man
      12. SOLR-5463.patch
        132 kB
        Hoss Man
      13. SOLR-5463.patch
        126 kB
        Hoss Man
      14. SOLR-5463.patch
        118 kB
        Hoss Man
      15. SOLR-5463.patch
        114 kB
        Steve Rowe
      16. SOLR-5463.patch
        111 kB
        Hoss Man
      17. SOLR-5463.patch
        115 kB
        Hoss Man
      18. SOLR-5463.patch
        110 kB
        Hoss Man
      19. SOLR-5463-randomized-faceting-test.patch
        8 kB
        Steve Rowe

          Activity

          Hoss Man added a comment -

          I've been reading up on the internals of IndexSearcher.searchAfter and the associated PagingFieldCollector it uses (as well as some of the problems encountered in SOLR-1726) and I'm not convinced it would be a slam dunk to try and use them directly in Solr:

          • IndexSearcher.searchAfter/PagingFieldCollector relies on the "client" (ie: Solr) passing back the FieldDoc of the last doc returned, and has expectations that the (lucene) docid contained in that FieldDoc will be meaningful
            • We could perhaps serialize a representation of the "last" FieldDoc to include in the response of each request, and then deserialize that into a suitable imposter object on the "searchAfter" request – but there is still the problem of the internal docid, which will be misleading in a multi-shard distributed Solr setup
          • There are a variety of code paths in SolrIndexSearcher for executing searches, and it's not immediately obvious (to me) if/when it would make sense to augment each of those paths with PagingFieldCollector (see yonik's comment in SOLR-1726 about faceting).

          With that in mind, the approach I'm going to pursue (largely for my own sanity) is:

          • Attempt a minimally invasive straw man implementation of "searchAfter" type functionality that works in distributed mode – ideally w/o modifying any existing Solr code.
          • Use this straw man implementation to sanity check that the end user API is useful
          • Build up good comprehensive (passing) tests against this straw man
          • Circle back and revisit the implementation details looking for opportunities to:
            • refactor to eliminate similar code duplication
            • improve performance

          My current idea is to implement this straw man solution using a new SearchComponent that would run after QueryComponent, along the lines of the following (a rough code sketch appears after this list)...

          • prepare:
            • No-Op unless "searchAfter" param is specified
              • Use some marker value to mean "first page"
            • assert that start==0 (doesn't make sense when using searchAfter)
            • assert that uniqueKey is one of the sort fields (to ensure consistent ordering)
            • if searchAfter param value indicates this is not the first request:
            • deserialize the token into a list of sort values
              • add a new PostFilter that restricts to documents based on those values and the sort directions (same basic logic as PagingFieldCollector)
          • process:
            • No-Op unless "searchAfter" param is specified
            • do nothing if this is a shard request
            • for regular old single node Solr requests: serialize the sort values of the last doc in the Doc List (that QueryComponent has already built) and put it in the response as the "next" searchAfter token
          • finishStage:
            • No-Op unless "searchAfter" param is specified and stage is "DONE"
            • serialize the sort values of the last doc in the Doc List (that QueryComponent already merged) and put it in the response as the "next" searchAfter token
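
          A rough skeleton of what such a component might look like (a sketch with hypothetical names – parseTotem, buildPostFilter, and serializeLastDoc stand in for logic not shown, and the actual patch differs):

            import java.io.IOException;

            import org.apache.solr.common.SolrException;
            import org.apache.solr.common.params.CommonParams;
            import org.apache.solr.common.params.ShardParams;
            import org.apache.solr.handler.component.ResponseBuilder;
            import org.apache.solr.handler.component.SearchComponent;

            public class DeepPagingComponent extends SearchComponent {

              @Override
              public void prepare(ResponseBuilder rb) throws IOException {
                String totem = rb.req.getParams().get("searchAfter");
                if (totem == null) return; // No-Op unless "searchAfter" is specified
                if (0 != rb.req.getParams().getInt(CommonParams.START, 0)) {
                  throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
                      "start must be 0 when using searchAfter");
                }
                // (also assert here that the sort includes the uniqueKey field)
                if (!"*".equals(totem)) { // "*" is the "first page" marker
                  // deserialize the totem into sort values and add a PostFilter
                  // restricting results to docs "after" those values, e.g.:
                  // rb.getFilters().add(buildPostFilter(parseTotem(totem), rb.getSortSpec()));
                }
              }

              @Override
              public void process(ResponseBuilder rb) throws IOException {
                if (rb.req.getParams().get("searchAfter") == null) return;
                if (rb.req.getParams().getBool(ShardParams.IS_SHARD, false)) return;
                // single node: serialize the sort values of the last doc in the
                // DocList that QueryComponent already built, e.g.:
                // rb.rsp.add("nextSearchAfter", serializeLastDoc(rb));
              }

              @Override
              public void finishStage(ResponseBuilder rb) {
                if (rb.req.getParams().get("searchAfter") == null) return;
                if (rb.stage != ResponseBuilder.STAGE_DONE) return;
                // distributed: serialize the sort values of the last doc that
                // QueryComponent merged, e.g.:
                // rb.rsp.add("nextSearchAfter", serializeLastDoc(rb));
              }

              @Override
              public String getDescription() { return "deep paging straw man (sketch)"; }

              @Override
              public String getSource() { return null; }
            }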
          Robert Muir added a comment -

          We could perhaps serialize a representation of the "last" FieldDoc to include in the response of each request, and then deserialize that into a suitable imposter object on the "searchAfter" request – but there is still the problem of the internal docid, which will be misleading in a multi-shard distributed Solr setup

          I disagree. The FieldDoc only contains the values that were sorted on. This is what is minimal and necessary to do paging.

          If Solr wants to avoid Lucene docids for some reason (e.g. because it does not yet implement searcher leases), then perhaps when using this feature (at least for now) the unique id should always be added as a tiebreaker to the sort.

          Hoss Man added a comment -

          I disagree. The FieldDoc only contains the values that were sorted on. This is what is minimal and necessary to do paging.

          FieldDoc subclasses ScoreDoc, which includes the internal docid – and PagingFieldCollector does look at it. But as you say: as long as we include uniqueKey in the fields (which I already mentioned) then the docid in the FieldDoc shouldn't matter since (I think?) it's only used as a tie breaker.
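
          For context, this is the call shape at the Lucene level that we're talking about emulating (a sketch assuming a score-then-id sort; the FieldDoc values are illustrative):

            import java.io.IOException;

            import org.apache.lucene.search.FieldDoc;
            import org.apache.lucene.search.IndexSearcher;
            import org.apache.lucene.search.Query;
            import org.apache.lucene.search.Sort;
            import org.apache.lucene.search.TopDocs;
            import org.apache.lucene.util.BytesRef;

            public class SearchAfterShape {
              // Fetch the page after a doc whose sort values were (lastScore, lastId).
              // The docid argument (0) is a placeholder: with uniqueKey in the sort,
              // it only matters as a tie breaker.
              static TopDocs nextPage(IndexSearcher searcher, Query q, Sort sort,
                                      float lastScore, BytesRef lastId) throws IOException {
                FieldDoc after = new FieldDoc(0, Float.NaN, new Object[] { lastScore, lastId });
                return searcher.searchAfter(after, q, 20, sort);
              }
            }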

          If Solr wants to avoid Lucene docids for some reason (e.g. because it does not yet implement searcher leases) ...

          I'm glad you brought up searcher leases, because I wanted to mention it before but I forgot...

          • I have no idea how to even try to implement searcher leases in a sane way in a distributed Solr setup, given that we want clients to be able to hit any replica on subsequent requests.
          • For my use cases, I actively do NOT want a searcher lease when doing deep paging: if documents matching my search, but on later pages I have not loaded yet, get deleted from the index, I don't want them included in the results once I get to that page just because they were a match X minutes ago when my search started.

          I think what makes the most sense is to ensure we can support deep paging w/o searcher leases, and then if/when searcher leases are supported people who want both can have both.


          I'm attaching my current progress with a straw man impl + tests. It includes the basic functionality & tests for doing deep paging on a single node Solr setup using numeric sorts.

          There are an absurd number of nocommits in this patch: most of them are in the impl and I'm not worried about them because I'm hoping the impl can ultimately be thrown out; some are in the test because of additional tests I want to write; some are in the test because of silly limitations in the impl.

          Only one class of nocommits really concerns me at this point, and that's the issue of dealing with String sorts – the way Solr's distributed sorting code deals with fields that use SortField.Type.STRING (and presumably SortField.Type.STRING_VAL) results in the coordinator node having a String object even though the underlying FieldComparator expects/uses BytesRef as the comparison value.

          I could probably hack around this, and convert the Strings back to BytesRef myself in the DeepPaging code – but this actually smells like a more fundamental problem we should address. It seems to be the same root problem that sarowe has been looking into in SOLR-5354 in order to play nicer with custom FieldTypes: safely "serializing" the true sort object (regardless of what it is) between shards->coordinator, and then deserializing it & using the real FieldComparator for each field to do the aggregated sorting of the docs from each shard.


          In any case, my next step is to get some distributed tests set up and working against this straw man impl, and then dig into throwing away the straw man impl and trying to replace it with PagingFieldCollector – possibly with a side diversion to help sarowe fix the underlying problems in SOLR-5354 first.

          Hoss Man added a comment -

          Updated the straw man to include distributed search support.

          I still want to beef up the randomized testing some more before moving on to replacing the straw man with a lower level impl using the PagingFieldCollector.

          Hoss Man added a comment -

          Added simple support for sorts involving score, and added randomized testing of multi-level sorts, both in single node and distributed modes.

          Next up I'm going to look into improving the serialization of the totem to make it work better with strings and CUSTOM SortFields – which requires leveraging the improvements sarowe is working on in SOLR-5354.

          Hoss Man added a comment -

          The totem serialization now takes advantage of the work done in SOLR-5354 so that searchAfter now plays nice with basic string sorting and custom sort functions.

          One limitation I uncovered, however, is that MissingStringLastComparatorSource (which ironically is used by both sortMissingLast and sortMissingFirst) throws UnsupportedOperationException for a large number of its methods, which makes it impossible to use to filter out docs that are "before" the searchAfter totem. It should be fixable, but I've punted on this for now.

          Hoss Man added a comment -

          Whoops ... last patch was stale and didn't have the randomized string fields for testing.

          Hoss Man added a comment -

          Patch update...

          • additional tests that mix deletes & updates with walking a cursor
          • more randomization of the types of queries being run
          • more hardening of the SearchAfterTotem class (it should be useful beyond the strawman)
          • more tests for the SearchAfterTotem serialization
          • more tests of bad input/usage
          • hook strawman component into example to try it out

          ...at this point I was going to pursue tweaking the user facing API a bit, so that the "next" totem was never null, it always corresponds to the last doc returned, and clients would check for 0 docs coming back to know when they are "done" and the "next" totem returned would be the same as the one they sent. If we do this, use cases like "I want every doc matching this query, and if there are no more, I want to remember where I left off and continue again later" would be possible if the client uses a compatible sort (ie: a timestamp field).

          However.... when manually using this with the example configs in order to sanity check that it was going to feel right as an API, I discovered problems where the queryResultCache was coming into play and never getting past "page" 3. I felt stupid for not thinking to test this earlier, and updated the test configs to include queryResultCache, but I can't reproduce the failure ... still not sure why.

          Need to investigate this further before doing anything else.

          Steve Rowe added a comment - edited

          However.... when manually using this with the example configs in order to sanity check that it was going to feel right as an API, I discovered problems where the queryResultCache was coming into play and never getting past "page" 3. I felt stupid for not thinking to test this earlier, and updated the test configs to include queryResultCache, but I can't reproduce the failure ... still not sure why.

          All the new tests pass for me with the latest patch.

          I'm never getting past "page" 2 (rather than 3) - searchAfter=<the page 2 totem> returns the same totem, along with the same page 2 results.

          Maybe not unexpected, but when I commented out <queryResultWindowSize>20</queryResultWindowSize> and set <queryResultMaxDocsCached> to a number smaller than the rows param, I can get past page 2 - success getting as far as page 5, in fact.

          Seems like the queryResultCache isn't paying attention to all query params? (I don't know the code myself.) From solrconfig.xml - a literal reading is that only the q, sort, start and rows params are taken into consideration:

              <!-- Query Result Cache
                   
                   Caches results of searches - ordered lists of document ids
                   (DocList) based on a query, a sort, and the range of documents requested.  
                -->
          
          Steve Rowe added a comment -

          I'm never getting past "page" 2 (rather than 3) - searchAfter=<the page 2 totem> returns the same totem, along with the same page 2 results.

          I was wrong - I wasn't including page 1 (searchAfter=*) in the page count - searchAfter=<the page 3 totem> returns itself for me.

          Also, if it were the case that the queryResultCache were ignoring the searchAfter param, then pages 2 and 3 would return the same results as the first page, which is not the case...

          Hoss Man added a comment -

          All the new tests pass for me with the latest patch.

          Right, I may not have been clear before, but the problem that confuses me is: even with queryResultCaching enabled in the tests, they pass – and do not exhibit the same problems I'm seeing when manually using searchAfter with the example configs.

          Also, if it were the case that the queryResultCache were ignoring the searchAfter param, then pages 2 and 3 would return the same results as the first page, which is not the case...

          When searchAfter=* is used, the search itself is no different from a regular query, it can be cached, or re-used from an existing cached query – the only thing special that happens is that the DeepPagingComponent knows it needs to compute the "nextSearchAfter" totem. Once you request "page #2" (with a searchAfter other than "*"), a PostFilter is used, and you can see from the cache stats that it's not considered a hit on the initial query – so far so good, but as you mentioned, "page #3" is where we start to see the same data as page #2 returned because it's a "cache hit".

          It would not surprise me at all if there is a bug in my DeepPagingComponent that was causing caching to be used when it shouldn't be (even though I think I did everything right in the PostFilter) ... what surprises me is how easy it is to reproduce manually, but how hard it is to reproduce in the automated tests.

          Hoss Man added a comment -

          Ah, HA!

          The reason the tests weren't failing is that I was an idiot and hadn't configured the queryResultCache properly in the test configs.

          Now that I've made the tests fail, I can finally start trying to figure out why they fail.

          Steve Rowe added a comment -

          I was an idiot and hadn't configured the queryResultCache properly in the test configs.

          aha - no wrapping <query> tags... yeah when I add that, boom, failures
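
          For anyone following along: the cache declarations are only picked up inside the <query> element of solrconfig.xml, so a working test config needs something along these lines (sizes illustrative):

            <query>
              <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
              <queryResultWindowSize>20</queryResultWindowSize>
              <queryResultMaxDocsCached>200</queryResultMaxDocsCached>
            </query>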

          Hoss Man added a comment -

          The cause of the caching bug was trivially silly...

          I hadn't bothered implementing hashCode or equals on the PostFilter used in the straw man because:

          • I implemented it to always return "false" from "getCache()" so Solr should never try to cache them anyway
          • I figured the Object base impls (based on Object identity) would be fine in any cases where they might get called.
            ...forgetting that...
          • even though my PostFilters were never getting cached, they were still getting used as part of the QueryResultCache key for the main query
          • The Query class overrides Object's hashCode & equals to compare boosts, and my PostFilter class was inheriting that w/o augmenting it – so instances of my PostFilter were all being treated as equal

          Everything works as expected now with caching enabled, and all tests pass.
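
          For the record, a sketch of the shape of the fix (hypothetical class and field names, not the actual patch):

            import java.io.IOException;
            import java.util.Arrays;

            import org.apache.lucene.search.IndexSearcher;
            import org.apache.solr.search.DelegatingCollector;
            import org.apache.solr.search.ExtendedQueryBase;
            import org.apache.solr.search.PostFilter;

            public class SearchAfterFilter extends ExtendedQueryBase implements PostFilter {
              private final Object[] afterValues; // sort values deserialized from the totem

              public SearchAfterFilter(Object[] afterValues) {
                this.afterValues = afterValues;
              }

              @Override
              public boolean getCache() { return false; } // never cache the filter itself

              @Override
              public DelegatingCollector getFilterCollector(IndexSearcher searcher) {
                return new DelegatingCollector() {
                  @Override
                  public void collect(int doc) throws IOException {
                    // real impl: only delegate for docs sorting "after" afterValues
                    super.collect(doc);
                  }
                };
              }

              // Even though the filter is never cached itself, it is part of the main
              // query's queryResultCache key, so identity-based equality breaks paging:
              @Override
              public int hashCode() {
                return 31 * super.hashCode() + Arrays.hashCode(afterValues);
              }

              @Override
              public boolean equals(Object other) {
                return super.equals(other) // checks class & boost
                    && Arrays.equals(afterValues, ((SearchAfterFilter) other).afterValues);
              }
            }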

          Hoss Man added a comment -

          Ok, updated patch making the change in user semantics I mentioned wanting to try last week. The best way to explain it is with a walk-through of a simple example (note: if you try the current strawman code, the "numFound" and "start" values returned in the docList don't match what I've pasted in the examples below – these examples show what the final results should look like in the finished solution).

          Initial requests using searchAfter should always start with a totem value of "*":

          http://localhost:8983/solr/deep?q=*:*&rows=20&sort=id+desc&searchAfter=*
          {
            "responseHeader":{
              "status":0,
              "QTime":2},
            "response":{"numFound":32,"start":-1,"docs":[
                // ...20 docs here...
              ]
            },
            "nextSearchAfter":"AoEjTk9L"}
          

          The nextSearchAfter token returned by this request tells us what to use in the second request...

          http://localhost:8983/solr/deep?q=*:*&rows=20&sort=id+desc&searchAfter=AoEjTk9L
          {
            "responseHeader":{
              "status":0,
              "QTime":7},
            "response":{"numFound":32,"start":-1,"docs":[
                // ...12 docs here...
              ]
            },
            "nextSearchAfter":"AoEoMDU3OUIwMDI="}
          

          Since this result block contains fewer rows than were requested, the client could automatically stop, but the nextSearchAfter is still returned, and it's still safe to request a subsequent page (this is the fundamental difference from the previous patches, where nextSearchAfter was set to null anytime the code could tell there were no more results)...

          http://localhost:8983/solr/deep?q=*:*&wt=json&indent=true&rows=20&fl=id,price&sort=id+desc&searchAfter=AoEoMDU3OUIwMDI=
          {
            "responseHeader":{
              "status":0,
              "QTime":1},
            "response":{"numFound":32,"start":-1,"docs":[]
            },
            "nextSearchAfter":"AoEoMDU3OUIwMDI="}
          

          Note that in this case, with no docs included in the response, the nextSearchAfter totem is the same as the input.

          For some sorts this makes it possible for clients to "resume" a full walk of all documents matching a query – picking up where they left off if more documents are added to the index that match (for example: when doing an ascending sort on a numeric uniqueKey field that always increases as new docs are added, or sorting by a timestamp field (asc) indicating when documents are crawled, etc...)

          This also works as you would expect for searches that don't match any documents...

          http://localhost:8983/solr/deep?q=text:bogus&rows=20&sort=id+desc&searchAfter=*
          {
            "responseHeader":{
              "status":0,
              "QTime":21},
            "response":{"numFound":0,"start":-1,"docs":[]
            },
            "nextSearchAfter":"*"}
          
          Hoss Man added a comment -

          The one significant change I still want to make before abandoning this straw man and moving on to using PagingFieldCollector under the covers is to rethink the vocabulary.

          At the Lucene/IndexSearcher level, this functionality is leveraged using a "searchAfter" param which indicates the exact "FieldDoc" returned by a previous search. The name makes a lot of sense in this API given that the FieldDoc you specify is expected to come from a previous search, and you are specifying that you want to "search for documents after this document" in the context of the specified query/sort.

          For the Solr request API however, I feel like this terminology might confuse people. I'm concerned people might think they can use the uniqueKey of the last document they got on the previous page (instead of realizing they need to specify the special token they were returned as part of that page).

          My thinking is that from a user perspective, we should call this functionality a "Result Cursor" and rename the request param and response key appropriately, something along the lines of...

          http://localhost:8983/solr/deep?q=*:*&rows=20&sort=id+desc&cursor=AoEjTk9L
          {
            "responseHeader":{
              "status":0,
              "QTime":7},
            "response":{"numFound":32,"start":-1,"docs":[
                // ... docs here...
              ]
            },
            "cursorContinue":"AoEoMDU3OUIwMDI="}
          
          • searchAfter => cursor
          • nextSearchAfter => cursorContinue

          What do folks think?

          Steve Rowe added a comment - edited
          • searchAfter => cursor
          • nextSearchAfter => cursorContinue

          +1

          I'm concerned people might think they can use the uniqueKey of the last document they got on the previous page

          I tried making this mistake (using the trailing unique id ("NOK" in this example) as the searchAfter param value), and I got the following error message:

          {
            "responseHeader":{
              "status":400,
              "QTime":2},
            "error":{
              "msg":"Unable to parse search after totem: NOK",
              "code":400}}
          

          (edit: cursorContinue => cursor in the sentence below)

          I think that error message should include the param name (cursor) that couldn't be parsed.

          Also, maybe it would be useful to include a prefix that will (probably) never be used in unique ids, to visually identify the cursor as such: like always prepending '*'? So your example of the future would become:

          http://localhost:8983/solr/deep?q=*:*&rows=20&sort=id+desc&cursor=*AoEjTk9L
          {
            "responseHeader":{
              "status":0,
              "QTime":7},
            "response":{"numFound":32,"start":-1,"docs":[
                // ... docs here...
              ]
            },
            "cursorContinue":"*AoEoMDU3OUIwMDI="}
          

          The error message when someone gives an unparseable cursor could then include this piece of information: "cursors begin with an asterisk".

          Steve Rowe added a comment -

          Another idea about the cursor: the Base64-encoded text is used verbatim, including the trailing padding '=' characters - these could be stripped out for external use (since they're there just to make the string length divisible by four), and then added back before Base64-decoding. In a URL, non-metacharacter '='s look weird, since '=' is already used to separate param names and values.

          Hoss Man added a comment -

          I think that error message should include the param name (cursor) that couldn't be parsed.

          Agreed ... the current error text is basically just a placeholder, ideally it should be something like...

          Unable to parse cursor param: value must either be '*' or the cursorContinue value from a previous search: NOK
          

          Also, maybe it would be useful to include a prefix that will (probably) never be used in unique ids, to visually identify the cursor as such: like always prepending '*'?

          Hmmm, I'm not sure if that's really worth the added bytes & parsing.

          If folks really felt like the param name should be "searchAfter" then I could certainly see the value in having some clear prefix, since the param name might lead folks into assuming they know what the input should be; but with "cursor" I don't think we need to worry as much about people assuming they know what to put there, and with a clear error message instructing people how to get a valid cursor (from cursorContinue), that seems good enough. (right?)

          the Base64-encoded text is used verbatim, including the trailing padding '=' characters - these could be stripped out for external use (since they're there just to make the string length divisible by four), and then added back before Base64-decoding. In a URL, non-metacharacter '='s look weird, since '=' is already used to separate param names and values.

          Interesting idea ... again: I'm not sure how I feel about the added overhead to the parsing just to shorten the totem – especially since clients will always need to safely URL-encode anyway, since Base64 strings can also include "+".

          However....

          In the current patch, I used the Base64 utility class Solr already had (used by BinaryField and a few other places). But your suggestion reminds me that commons-codec's Base64 class (jar already used by Solr) supports a "url safe" variant of base64 (which looks like it's defined in RFC 4648?)...

          https://commons.apache.org/proper/commons-codec/javadocs/api-release/org/apache/commons/codec/binary/Base64.html#encodeBase64URLSafeString(byte[])

          ...something to consider.
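
          For reference, a tiny sketch of the commons-codec calls in question (the input bytes are arbitrary):

            import org.apache.commons.codec.binary.Base64;

            public class Base64Variants {
              public static void main(String[] args) {
                byte[] totem = {0x02, (byte) 0x81, 0x23, 0x4e, 0x4f, 0x4b};
                String standard = Base64.encodeBase64String(totem);       // may contain '+', '/', '=' padding
                String urlSafe = Base64.encodeBase64URLSafeString(totem); // '-' and '_', no '=' padding
                byte[] decoded = Base64.decodeBase64(urlSafe);            // decodeBase64 accepts both variants
                System.out.println(standard + " / " + urlSafe + " / " + decoded.length);
              }
            }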


          One other comment I got from a coworker offline was why I liked cursorContinue instead of nextCursor or cursorNext. My thinking was that since 'cursor' (as a concept) is a noun, "next cursor" might suggest that it was a (different) cursor than the one currently in use. I don't want people to think these strings are names of cursors that they re-use until they are done. I want to make it clear that to continue fetching results from this cursor, you have to specify the new value.

          Would "cursorAdvance" convey that better then cursorContinue ?

          Steve Rowe added a comment -

          In the current patch, I used the Base64 utility class Solr already had (used by BinaryField and a few other places). But your suggestion reminds me that commons-codec's Base64 class (jar already used by Solr) supports a "url safe" variant of base64 (which looks like it's defined in RFC 4648?)...
          https://commons.apache.org/proper/commons-codec/javadocs/api-release/org/apache/commons/codec/binary/Base64.html#encodeBase64URLSafeString(byte[])

          I forgot to mention that the "url safe" variant was discussed on the issue where the Base64 utility class was introduced: SOLR-1116, and if I understand correctly, people thought the "url safe" variant wasn't necessary, since all modern browsers accept URLs with embedded standard Base64.

          Steve Rowe added a comment -

          One other comment I got from a coworker offline was why I liked cursorContinue instead of nextCursor or cursorNext. My thinking was that since 'cursor' (as a concept) is a noun, "next cursor" might suggest that it was a (different) cursor than the one currently in use. I don't want people to think these strings are names of cursors that they re-use until they are done. I want to make it clear that to continue fetching results from this cursor, you have to specify the new value.

          Would "cursorAdvance" convey that better then cursorContinue ?

          You want to convey a (resumption) position in a result sequence. A cursor is not itself a position; it's a movable pointer to a position. The value of the cursor param is not the cursor itself; rather, it's the position from which to resume iterating over a result sequence.

          The problem as I see it is that the cursor itself has to be anonymous, since the implementation stores no server-side state; passing in an opaque label thus is not possible. So you're asking people to understand that they're moving a thing-that-can't-have-a-fixed-label, and that's cognitively dissonant.

          Maybe "continuation"/"nextContinuation" or "continue"/"nextContinue" or "pos"/"nextPos"? (I don't love any of them.)

          Hoss Man added a comment -

          Maybe "continuation"/"nextContinuation" or "continue"/"nextContinue" or "pos"/"nextPos"? (I don't love any of them.)

          Of all the ideas so far, I dislike cursor & cursorAdvance the least.

          But I think you're on to something here...

          You want to convey a (resumption) position in a result sequence. A cursor is not itself a position; it's a movable pointer to a position. The value of the cursor param is not the cursor itself; rather, it's the position from which to resume iterating over a result sequence.

          Perhaps the key here is to pick a param name that makes it clear it's not a "cursor name" but a "position along the progression of a cursor" ... things like cursorPosition, cursorPoint, and cursorValue seem like they could all easily suffer the same confusion as searchAfter ... but perhaps we could find a suitable "attribute" qualifier on cursor that is more clearly and obviously disjoint from something the user might think they can define on their own?

          • cursorMark / nextCursorMark
          • cursorPhase / nextCursorPhase
          • cursorToken / nextCursorToken
          • cursorPosToken / nextCursorPosToken

          Does any of this sound resoundingly better than the existing ideas to anyone? I think my new favorite is cursorMark / nextCursorMark ... they seem suitably abstract that people won't presume they know what they mean.

          David Smiley added a comment -

          Nice work thus far Hoss!

          FWIW I dislike using the word "token" as it might be mistakenly associated with text analysis. I suggest cursorKey.

          Hoss Man added a comment -

          FWIW I dislike using the word "token" as it might be mistakenly associated with text analysis.

          Great point.

          I suggest cursorKey.

          I'm concerned that people might read that with the same implied assumptions that we've been worried about with "cursor" – that it's a "cursor name" or "cursor identifier" that they can define on their own and/or should reuse in subsequent requests.

          Steve Rowe added a comment -

          I think my new favorite is cursorMark / nextCursorMark

          +1

          I suggest cursorKey.

          I like "mark" better than "key", since in addition to conveying that it's a sign/symbol for something else (as "key" does), it also has a positional meaning. And it does both those things without inviting people to use unique ids in their place.

          Bill Bell added a comment -

          How does this work across slaves? Won't we need to set a sticky session - or are you hashing the key for the slaves?

          Hoss Man added a comment -

          How does this work across slaves? Won't we need to set a sticky session - or are you hashing the key for the slaves?

          Nope. All of the information needed (the sort values to "search after") is encoded in the totem string. So in multi-node setups (either simple replication or multi-replica SolrCloud) no request affinity is needed.
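
          To make the statelessness concrete, here's roughly what a client-side walk looks like with the cursorMark API as it was ultimately committed (a minimal SolrJ sketch, assuming a 4.x HttpSolrServer, a hypothetical collection1 core, and a hypothetical price sort field — not code from the patch). Because each mark is self-contained, every iteration could hit a different replica behind a load balancer:

          import org.apache.solr.client.solrj.SolrQuery;
          import org.apache.solr.client.solrj.SolrServerException;
          import org.apache.solr.client.solrj.impl.HttpSolrServer;
          import org.apache.solr.client.solrj.response.QueryResponse;
          import org.apache.solr.common.params.CursorMarkParams;

          public class CursorWalk {
            public static void main(String[] args) throws SolrServerException {
              // no session affinity needed: the mark carries all cursor state
              HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");

              SolrQuery q = new SolrQuery("*:*");
              q.setRows(100);
              q.addSort("price", SolrQuery.ORDER.desc); // hypothetical sort field
              q.addSort("id", SolrQuery.ORDER.asc);     // uniqueKey tiebreaker is required

              String cursorMark = CursorMarkParams.CURSOR_MARK_START; // "*"
              while (true) {
                q.set(CursorMarkParams.CURSOR_MARK_PARAM, cursorMark);
                QueryResponse rsp = server.query(q);
                // ... process rsp.getResults() ...
                String next = rsp.getNextCursorMark();
                if (cursorMark.equals(next)) {
                  break; // mark stopped changing: the walk is complete
                }
                cursorMark = next;
              }
            }
          }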


          FYI: I've posted a blog with some general discussion about the goals of this feature and some performance numbers based on the straw man implementation so far...

          http://searchhub.org/coming-soon-to-solr-efficient-cursor-based-iteration-of-large-result-sets/


          I'm going to move forward with one final straw man patch renaming everything to use the cursorMark / nextCursorMark convention discussed above, and then proceed with trying to throw out the straw man and integrate directly into SolrIndexSearcher using the PagingFieldCollector.

          Hoss Man added a comment -

          Final strawman patch i plan to work on...

          • SearchAfterTotem -> CursorMark
          • DeepPagingTest -> CursorPagingTest
          • TestDistribDeepPaging -> DistribCursorPagingTest
          • searchAfter -> cursorMark
          • nextSearchAfter -> nextCursorMark
          • improved javadocs
          • CursorMark.toString -> CursorMark.getSerializedTotem

          ...next up, throwing out DeepPagingComponent and hooking this logic directly into QueryComponent and SolrIndexSearcher (via PagingFieldCollector) using the same tests.

          Hoss Man added a comment -

          Baby steps towards a real solution. This still has a DeepPagingComponent for doing setup & managing the sort values, but...

          • Gutted the vestigial remnants of SOLR-1726
          • replaced CursorMark.getPagingFilter with CursorMark.getSearchAfterFieldDoc
          • made ResponseBuilder and QueryCommand keep track of a CursorMark
          • refactored SolrIndexSearcher to add a buildTopDocsCollector helper
          • made buildTopDocsCollector aware of CursorMark.getSearchAfterFieldDoc
          • added a test to ensure that the non-cacheability of the cursor query wouldn't affect the independent caching of the filter queries.
          Steve Rowe added a comment - edited

          (edit: fixed misspelled PaginationCollector -> PagingFieldCollector)

          One limitation i uncovered, however, is that MissingStringLastComparatorSource (which ironically is used by both sortMissingLast and sortMissingFirst) throws UnsupportedOperationException for a large number of its methods, which makes it impossible to use to filter out docs that are "before" the searchAfter totem. It should be fixable, but i've punted on this for now.

          I investigated this, and added an implementation for the only method that was needed (TermOrdValComparator_SML.compareDocToValues(int doc, Comparable docValue)). The attached patch adds this implementation and uncomments the elements in schema-sorts.xml that Hoss commented out to get tests to pass, and assumes Hoss's most recent strawman patch has already been applied. (I also tried converting the class to extend FieldComparator<BytesRef> instead of the current FieldComparator<Comparable>, but that causes TestDistributedGrouping to fail with a message about Integer not being castable to BytesRef - that's why I introduced an instanceof BytesRef check to convert the incoming value to BytesRef if necessary, via toString().)

          NOTE: this patch is not required by Hoss's latest (non-strawman) patch - when I uncomment the commented-out elements in schema-sorts.xml that triggered test failures in the strawman, all tests still pass, without this patch. So I'm attaching the patch to this issue just for anybody who wants to try out the strawman implementation.

          I believe my patch is not needed in the latest non-strawman implementation because TermOrdValComparator_SML delegates to inner class AnyOrdComparator for per-segment sorting, and AnyOrdComparator has a functional implementation of compareDocToValues(); unlike the strawman implementation, the latest version invokes sorts on a per-segment basis, via PagingFieldCollector.

          Hoss Man added a comment -

          Small bits of progress (went down a bad fork in the road and got sidetracked with a bad idea)...

          • re-enable missingLast test fields (thanks sarowe!)
          • refactored (single node) next CursorMark sort value extraction into SolrIndexSearcher
          • improved tests based on things that occurred to me as the implementation changed
          Hoss Man added a comment -

          DeepPagingComponent is dead, long live CursorMark!

          This patch removes DeepPagingComponent completely, as all of the necessary functionality is now integrated nicely into various places in QueryComponent, ResponseBuilder, and SolrIndexSearcher.

          There are still plenty of nocommits, but those are mostly around test improvements and/or code paths that i want to give some more review before committing.

          I'm getting ready to go on vacation for a week+ so now is a really good time for people to take the patch for a spin and try it out with their use cases (nudge, nudge) w/o needing to worry that i'll upload a new one as soon as they download it.

          Hoss Man added a comment -

          FYI: I've updated my previous blog with new graphs showing the improvements the current patch has over the (lazy) strawman...

          http://searchhub.org/2013/12/12/coming-soon-to-solr-efficient-cursor-based-iteration-of-large-result-sets/#update_2013_12_18

          Steve Rowe added a comment -

          Hoss Man, I accidentally discovered that a blank cursorMark request param is ignored - is this intentional? I ask because although CursorMark.parseSerializedTotem("") throws an exception about the bad format of an empty totem, QueryComponent.prepare() ignores a blank cursorMark param:

          // note: the "" default makes a present-but-blank cursorMark param
          // indistinguishable from an absent one, so it is silently ignored
          final String cursorStr = rb.req.getParams().get(CursorMark.CURSOR_MARK_PARAM,"");
          if (! StringUtils.isBlank(cursorStr) ) {
            final CursorMark cursorMark = new CursorMark(rb.req.getSchema(),
                                                         rb.getSortSpec());
            cursorMark.parseSerializedTotem(cursorStr);
            rb.setCursorMark(cursorMark);
          }
          

          Shouldn't this instead throw an exception when the param is present but has a blank value? Something like the following would allow parseSerializedTotem() to throw its exception for a blank cursorMark:

          // with no default value, a present-but-blank param reaches
          // parseSerializedTotem(), which rejects it
          final String cursorStr = rb.req.getParams().get(CursorMark.CURSOR_MARK_PARAM);
          if (null != cursorStr) {
            final CursorMark cursorMark = new CursorMark(rb.req.getSchema(),
                                                         rb.getSortSpec());
            cursorMark.parseSerializedTotem(cursorStr);
            rb.setCursorMark(cursorMark);
          }
          
          Steve Rowe added a comment -

          Hoss Man, you have nocommits wondering what the value of start should be in returned results when using cursor functionality, e.g. from DistribCursorPagingTest.doSimpleTest():

          // assertStartAt(-1, results); // nocommit: what should start be? -1?
          

          What would be cool is if start could be the number of docs in previous pages. PagingFieldCollector.collect() throws this info away now, though.

          Hoss Man added a comment -

          I accidentally discovered that a blank cursorMark request param is ignored - is this intentional? I ask because although CursorMark.parseSerializedTotem("") throws an exception about the bad format of an empty totem, QueryComponent.prepare() ignores a blank cursorMark param:

          Yeah ... that was intentional. My thinking was that from CursorMark's perspective, attempting to parse a null or blank string was not valid – but from QueryComponent's perspective, a null or blank string meant "do not construct a CursorMark object at all"

          The basic motivation was that it didn't seem like it should be an error to have something like /select?q=foo&cursorMark=&... ... but i'm not adamant that it should work that way.

          In fact, thinking about it more, and looking at how some other params (like start, rows, facet, etc...) deal with blank strings, i agree with you – it should be an error.


          What would be cool is if start could be the number of docs in previous pages.

          PagingFieldCollector.collect() throws this info away now though.

          Yeah ... i was initially thinking that "-1" or some other marker value would be handy to help make it clear that they shouldn't infer any meaning from it when using a cursor – but then i realized that was probably more dangerous than just using "0", because at least then there was less risk of confusing off-by-one errors if they blindly use it in some context (but i forgot to remove those nocommit questions)

          If you can see an easy way to get the "real" number from PagingFieldCollector, that might be handy too ... i'm not sure.

          I'm perfectly happy with a fixed value of 0 at the moment ... it could always be revisited in the future.

          Steve Rowe added a comment -

          Patch with a few changes added onto Hoss's most recent patch.

          I accidentally discovered that a blank cursorMark request param is ignored - is this intentional? I ask because although CursorMark.parseSerializedTotem("") throws an exception about the bad format of an empty totem, QueryComponent.prepare() ignores a blank cursorMark param:

          Yeah ... that was intentional. My thinking was that from CursorMark's perspective, attempting to parse a null or blank string was not valid – but from QueryComponent's perspective, a null or blank string meant "do not construct a CursorMark object at all"
          The basic motivation was that it didn't seem like it should be an error to have something like /select?q=foo&cursorMark=&... ... but i'm not adamant that it should work that way.
          In fact, thinking about it more, and looking at how some other params (like start, rows, facet, etc...) deal with blank strings, i agree with you – it should be an error.

          I changed the behavior to make blank cursorMark params raise an error, and added a couple tests for it to CursorMarkTest.

          I fixed a few misspellings in comments, and removed a few unused imports.

          I added a new test TestCursorMarkWithoutUniqueKey.

          I substituted CURSOR_MARK_PARAM for "cursorMark", and CURSOR_MARK_START for "*" in CursorPagingTest.

          Hoss Man added a comment -

          More patch improvements..

          • cleaned up the test nocommits related to the startAt value (for now it's just 0, sarowe opened LUCENE-5380 with his other idea, we can revisit and change the cursor code/test later if/when that goes through)
          • added some comments regarding potential improvements via SOLR-5595
          • cleaned up some nocommits that were acting as reminders of things i wanted to review later with fresh eyes
          • converted some nocommit asserts to clean user exceptions
          • user exception if attempting to combine cursor with timeAllowed (sarowe pointed this out to me in IRC: the nextCursorMark would make no sense)
          • added more tests for bad user input conditions
          Hoss Man added a comment -

          Improvements in this patch...

          • Moved cursor param constants to .common.params.CursorMarkParams
          • Added QueryResponse.getNextCursorMark
          • switch cloud tests to use QueryResponse.getNextCursorMark
          • fixed a small ordering inconsistency between the single core response and the cloud response
          • fix TestCursorMarkWithoutUniqueKey's lifecycle so it plays nicely with multiple runs
          • add a custom sorting field type into the tests, leveraging sarowe's work in SOLR-5354
          Hoss Man added a comment -

          New in this patch...

          • user error if trying to use cursor with grouping
          • I actually thought i put this in a while ago, because i couldn't wrap my head around what it should mean, in a perfect world, to use grouping with a cursor (let alone how to implement it), but reviewing the tests i realized it wasn't there yet.
          • fixed the last remaining nocommit: sortDocSet
            • working through the code, i realized we could actually leverage the cached docSets in the useFilterForSortedQuery situation – so I refactored the method a bit to give it more context (so it could call the buildTopDocsCollector helper method i added), removed the restriction on useFilterCache for sorted doc set when a cursor is used, and randomized useFilterForSortedQuery in the cursor test configs
          • added simple test combining faceting w/cursor. this was something i was pretty certain would work fine, but reviewing the clover coverage reports when running just the cursor tests, i realized that the getDocListAndSet paths in SolrIndexSearcher weren't being hit, so we needed a test to prove it.

          There are probably still a lot more permutations of things that could be tested, but i'm feeling really good about the state of this patch – i think it's ready to commit (to trunk) and let jenkins churn away at it.

          i'll plan on pushing to trunk on Monday unless anyone has concerns.

          ASF subversion and git services added a comment -

          Commit 1556036 from hossman@apache.org in branch 'dev/trunk'
          [ https://svn.apache.org/r1556036 ]

          SOLR-5463: new 'cursorMark' request param for deep paging of sorted result sets

          Hoss Man added a comment -

          I just committed the latest patch to trunk...
          Committed revision 1556036.

          Since this is a pretty big change, I'm going to let it soak for a few days and get hammered by jenkins a bit before attempting to backport.

          Hoss Man added a comment -

          I haven't seen any negative feedback or suspicious jenkins failures, so unless someone sees a problem i'll start backporting tomorrow.

          Joel Bernstein added a comment -

          This is a great feature. I think this should work automatically with the CollapsingQParserPlugin so there's some grouping support. I'll do some testing on this to confirm.

          Hoss Man added a comment -

          I think this should work automatically with the CollapsingQParserPlugin so there's some grouping support. I'll do some testing on this to confirm.

          Cool – thanks.

          I think what we have on trunk now is pretty stable, so I'll go ahead and backport to 4x, and if any tweaks are needed to play nice with CollapsingQParser (or at a minimum: tests to show how to use them in combination and what the results are) we should probably track those in a new issue...

          Cool with you, Joel Bernstein?

          Joel Bernstein added a comment -

          Sounds like a plan. I'll check in after testing.

          ASF subversion and git services added a comment -

          Commit 1557192 from hossman@apache.org in branch 'dev/trunk'
          [ https://svn.apache.org/r1557192 ]

          SOLR-5463: move CHANGES to 4.7 for backporting

          ASF subversion and git services added a comment -

          Commit 1557196 from hossman@apache.org in branch 'dev/branches/branch_4x'
          [ https://svn.apache.org/r1557196 ]

          SOLR-5463: new 'cursorMark' request param for deep paging of sorted result sets (merge r1556036 and r1557192)

          Hoss Man added a comment -

          Backported to 4x w/o much complication (a small autoboxing situation caused the java6 compile to complain w/o an explicit cast)

          I'll get started on some new refguide content.....

          ASF subversion and git services added a comment -

          Commit 1557370 from Robert Muir in branch 'dev/branches/branch_4x'
          [ https://svn.apache.org/r1557370 ]

          SOLR-5463: disable codecs that don't support docvalues in these tests

          Show
          ASF subversion and git services added a comment - Commit 1557370 from Robert Muir in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1557370 ] SOLR-5463 : disable codecs that don't support docvalues in these tests
          Hide
          Steve Rowe added a comment -

          Patch adding a randomized faceting test to CursorPagingTest to validate that aggregating field value counts via a deep paging full walk arrives at the same results as faceting. Also checks that facet results are the same with each page.

          I'm running this test in a loop 100 times - once that finishes with no failures (none yet at ~75 iterations), I'll commit.

          ASF subversion and git services added a comment -

          Commit 1557800 from Steve Rowe in branch 'dev/trunk'
          [ https://svn.apache.org/r1557800 ]

          SOLR-5463: added randomized faceting test to CursorPagingTest

          ASF subversion and git services added a comment -

          Commit 1557821 from Steve Rowe in branch 'dev/branches/branch_4x'
          [ https://svn.apache.org/r1557821 ]

          SOLR-5463: added randomized faceting test to CursorPagingTest (merged trunk r1557800)

          ASF subversion and git services added a comment -

          Commit 1558939 from hossman@apache.org in branch 'dev/trunk'
          [ https://svn.apache.org/r1558939 ]

          SOLR-5463: more details in case of spooky 'walk already seen' errors

          ASF subversion and git services added a comment -

          Commit 1558945 from hossman@apache.org in branch 'dev/branches/branch_4x'
          [ https://svn.apache.org/r1558945 ]

          SOLR-5463: more details in case of spooky 'walk already seen' errors (merge r1558939)

          yuanyun.cn added a comment - edited

          This feature is great, and I am trying to use it.
          But I found one issue. For the first query I set start=0&rows=10&cursorMark=*, and it returns the first 10 docs; then I update cursorMark with the nextCursorMark value returned in the last response.

          Then the request failed with the error: Cursor functionality requires start=0
          This is caused by the CursorMark constructor, which checks that the offset must be 0 when cursorMark is set:

          if (0 != sortSpec.getOffset()) {
            throw new SolrException(ErrorCode.BAD_REQUEST, "Cursor functionality requires start=0");
          }

          Did I miss anything?

          Thanks..

          Steve Rowe added a comment -

          For the first query I set start=0&rows=10&cursorMark=*, and it returns the first 10 docs; then I update cursorMark with the nextCursorMark value returned in the last response.

          On the second request, what value did you give the start param?

          yuanyun.cn added a comment -

          Thanks, Steve.

          I made a mistake in the second query: I set start to 10, as I usually did before.

          I changed start=0 in all subsequent queries, updated cursorMark accordingly, and it works well.

          But we should always use start=0 when using the cursorMark feature.
          This feature is great. Thanks again.
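
          In other words, only the mark advances between requests; start stays 0 on every page. A parameter sketch (the mark value is whatever nextCursorMark the previous response returned; the sort shown is just an example):

          # first page
          q=*:*&sort=score+desc,id+asc&rows=10&start=0&cursorMark=*
          # every subsequent page: same query and sort, start=0, new mark
          q=*:*&sort=score+desc,id+asc&rows=10&start=0&cursorMark=<nextCursorMark from previous response>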

          Yonik Seeley added a comment -

          Nice work guys!

          Some further thoughts:

          We should consider allowing non-zero "start" parameters with cursorMark. The primary use case is when someone is skipping pages (perhaps trying to get to a different section of results, or trying to get much later in a time based search, or just viewing the long tail).

          For example, a user at page 50 clicks on page 60. It would be nice to support this by just specifying start=90 (i.e. 600-510) assuming 10 docs per page, along with the normal cursorMark (that would have started at page 51 / doc 510). Currently, the prohibition on non-zero start parameters would mean that we would either have to abandon cursoring altogether, or we would have to actually retrieve 100 documents to continue it.

          The other thought is around how to do reverse paging efficiently. One way is to save previous cursorMarks on the client side and just return to them if one wants to page backwards. The other potential way is to reverse the sort parameters and use the current cursorMark. The only pitfall to this approach is that you don't get the current document you are on (because we "searchAfter").

          Hoss Man added a comment -

          Some further thoughts: ...

          Yonik: no disagreement from me, but since what we've got so far has already been committed and backported to 4x, i think it would make sense to track your enhancement ideas in new issues (unless you think you can help bang these out before 4.7).

          Alexander S. added a comment -

          Inability to use this without sorting by a unique key (e.g. id) makes this feature useless. The same could be achieved previously by sorting on id and searching for docs where the id is >/< the last one received. See how cursors work in MongoDB; that's the right direction.

          Alexander S. added a comment -

          http://docs.mongodb.org/manual/core/cursors/
          Alexander S. added a comment -

          Sorry for spamming, but I can't edit my previous message. I just found that in Mongo, cursors also aren't isolated and can return duplicates; I thought they were. But sorting docs by id is not acceptable in 99% of use cases, especially in Solr, where results are usually expected to be sorted by relevance.

          Yonik Seeley added a comment -

          But sorting docs by id is not acceptable in 99% of use cases, especially in Solr, where results are usually expected to be sorted by relevance.

          It's only a tiebreak by "id" that is needed. So "sort=score desc, id asc" is fine.

          Alexander S. added a comment -

          Oh, that's awesome, thanks for the tip.

          David Smiley added a comment -

          I think Solr could be more user-friendly here by auto-adding the ", id asc" if it's not there.

          Hoss Man added a comment -

          I think Solr could be more user-friendly here by auto-adding the ", id asc" if it's not there.

          The reason the code currently throws an error is that i figured it was better to force the user to choose which tie breaker they wanted (asc vs desc) than to just magically pick one arbitrarily.

          If folks think a magic default is better, i've got no serious objections – just open a new issue.
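
          In the meantime, clients that want the convenience can append the tiebreaker themselves. A trivial sketch (assuming the uniqueKey field is "id"; the helper name and the naive check are mine, not anything in Solr):

          // hypothetical client-side helper: append the uniqueKey tiebreaker
          // to a sort spec if it isn't already present
          static String withIdTiebreaker(String sort) {
            if (sort == null || sort.trim().isEmpty()) {
              return "id asc";
            }
            // naive check: assumes "id" only ever appears as a sort field name
            return sort.matches(".*\\bid\\b.*") ? sort : sort + ",id asc";
          }

          The client would then send sort=withIdTiebreaker(userSort) along with the cursorMark param, and still get an explicit asc tiebreak by default.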

          Alexander S. added a comment -

          If, as David mentioned, Solr adds it only when it is not there, that should preserve the ability for users to manually specify another key and order when required (a rare case, it seems).

          Alexander S. added a comment -

          I have another idea about the cursors implementation. It's just an idea; I am not sure if it's possible to do.

          Is it possible to use cursors together with the "start" and "rows" parameters? That would allow pagination and drawing links for prev, next, 1, 2, 3, n+1 pages, as we can do now. So instead of using cursorMark we'd use cursorName, which could be static. The request start:0, rows:10, cursorName:* would return the first page of results and a static cursor name, which could then be used for all other pages (i.e. start:10, rows:10, cursorName:#{received_cursor_name}).

          Does that make sense?


            People

            • Assignee: Hoss Man
            • Reporter: Hoss Man
            • Votes: 12
            • Watchers: 23