[LUCENE-1997] Explore performance of multi-PQ vs single-PQ sorting API - ASF JIRA

Details

Type: Improvement
Status: Reopened
Priority: Major
Resolution: Unresolved
Affects Version/s: 2.9
Fix Version/s: None
Component/s: core/search
Labels: None
Lucene Fields: New

Description

Spinoff from the recent "lucene 2.9 sorting algorithm" thread on java-dev,
where a simpler (non-segment-based) comparator API was proposed that
gathers results into multiple priority queues (PQs), one per segment,
and then merges them at the end.
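
To make the two strategies concrete, here is a minimal Python sketch of the idea. It is not Lucene's comparator API: plain integers stand in for sort values, doc IDs and tie-breaking are ignored, and the function names `top_n_single_pq` and `top_n_multi_pq` are made up for illustration.

```python
import heapq


def top_n_single_pq(segments, n):
    """Collect the n smallest values across all segments using one bounded queue."""
    if n <= 0:
        return []
    heap = []  # max-heap emulated by negating values; never holds more than n entries
    for segment in segments:
        for value in segment:
            if len(heap) < n:
                heapq.heappush(heap, -value)
            elif value < -heap[0]:
                # value beats the current worst entry, so it replaces it
                heapq.heapreplace(heap, -value)
    return sorted(-v for v in heap)


def top_n_multi_pq(segments, n):
    """Collect the n smallest values per segment, then merge the per-segment results."""
    if n <= 0:
        return []
    # heapq.nsmallest returns each per-segment result sorted ascending
    per_segment = [heapq.nsmallest(n, segment) for segment in segments]
    # heapq.merge lazily merges the already-sorted lists; keep only the first n
    merged = heapq.merge(*per_segment)
    return [value for _, value in zip(range(n), merged)]
```

Both functions should return the same top N for the same input; the question this issue explores is which one is faster in practice.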

I started from John's multi-PQ code and worked it into
contrib/benchmark so that we could run perf tests. Then I generified
the Python script I use for running search benchmarks (in
contrib/benchmark/sortBench.py).

The script first creates indexes with 1M docs (based on
SortableSingleDocSource and, if available, on wikipedia). Then
it runs various combinations:

- Index with 20 balanced segments vs index with the "normal" log
  segment size
- Queries with different numbers of hits (only for wikipedia index)
- Different top N
- Different sorts (by title, for wikipedia, and by random string,
  random int, and country for the random index)
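
The combinations above could be enumerated roughly as follows. This is only a sketch, not the code in sortBench.py: the top N values and query names are placeholders (the description does not list the actual ones), and using a single query for the random index is an assumption.

```python
from itertools import product

SEGMENT_LAYOUTS = ("balanced20", "log")
TOP_NS = (10, 100)  # placeholder values, not taken from the issue
SORTS = {
    "wikipedia": ("title",),
    "random": ("random string", "random int", "country"),
}
QUERIES = {
    # hypothetical query names; only the wikipedia index varies the hit count
    "wikipedia": ("query_few_hits", "query_many_hits"),
    "random": ("single_query",),  # assumption: one query for the random index
}


def test_matrix(indexes=("wikipedia", "random")):
    """Yield one dict per (index, segment layout, query, top N, sort) combination."""
    for index in indexes:
        combos = product(SEGMENT_LAYOUTS, QUERIES[index], TOP_NS, SORTS[index])
        for layout, query, top_n, sort in combos:
            yield {
                "index": index,
                "segments": layout,
                "query": query,
                "top_n": top_n,
                "sort": sort,
            }
```

Passing `indexes=("random",)` would cover the case where the wikipedia source is not available.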

For each test, 7 search rounds are run and the best QPS is kept. The
script runs singlePQ then multiPQ, records the resulting best QPS
for each, and produces a table (in Jira format) as output.
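
The best-of-7 measurement and the Jira table output can be sketched as below. The helper names `best_qps` and `jira_table` are hypothetical, and the column layout (including the percentage change column) is an assumption rather than the script's actual output; only the `||heading||` and `|cell|` row syntax is standard Jira wiki markup.

```python
def best_qps(run_round, rounds=7):
    """Run a search round `rounds` times and keep the best QPS it reports.

    `run_round` is any zero-argument callable returning a QPS figure.
    """
    return max(run_round() for _ in range(rounds))


def jira_table(rows):
    """Render (test name, singlePQ QPS, multiPQ QPS) rows as a Jira wiki table."""
    lines = ["||Test||singlePQ QPS||multiPQ QPS||Change||"]
    for name, single, multi in rows:
        # relative change of multiPQ against the singlePQ baseline, in percent
        change = 100.0 * (multi - single) / single
        lines.append("|%s|%.1f|%.1f|%+.1f%%|" % (name, single, multi, change))
    return "\n".join(lines)
```

Keeping the best round rather than the average is one way to reduce noise from warm-up and other activity on the machine.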

Attachments

- LUCENE-1997.patch, 20/Oct/09 15:05, 39 kB, Michael McCandless
- LUCENE-1997.patch, 20/Oct/09 16:41, 40 kB, Michael McCandless
- LUCENE-1997.patch, 23/Oct/09 16:06, 45 kB, Michael McCandless
- LUCENE-1997.patch, 23/Oct/09 16:08, 45 kB, Michael McCandless
- LUCENE-1997.patch, 27/Oct/09 09:34, 46 kB, Michael McCandless
- LUCENE-1997.patch, 27/Oct/09 10:26, 46 kB, Michael McCandless
- LUCENE-1997.patch, 27/Oct/09 19:46, 48 kB, Michael McCandless
- LUCENE-1997.patch, 28/Oct/09 10:09, 49 kB, Michael McCandless
- LUCENE-1997.patch, 30/Oct/09 10:37, 59 kB, Michael McCandless

People

Assignee: Michael McCandless
Reporter: Michael McCandless
Votes: 0
Watchers: 0

Dates

Created: 20/Oct/09 15:02
Updated: 28/Aug/22 12:12