[SOLR-5894] Speed up high-cardinality facets with sparse counters - ASF JIRA

Details

Type: Improvement
Status: Open
Priority: Minor
Resolution: Unresolved
Affects Version/s: 4.7.1
Fix Version/s: None
Component/s: SearchComponents - other
Labels:

Description

Multiple performance enhancements to Solr String faceting.

- Sparse counters, replacing the constant-time overhead of extracting the top-X terms with an overhead that is linear in the result set size
- Counter re-use, for reduced garbage collection and lower per-call overhead
- Optional counter packing, trading speed for space
- Improved distribution count logic, greatly improving the performance of distributed faceting
- In-segment threaded faceting
- Regexp-based white- and black-listing of facet terms
- Heuristic faceting for large result sets
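
To illustrate the first two items, here is a minimal sketch of a sparse counter, written for this description and not taken from the patch. The class name `SparseCounter`, the `maxTracked` parameter, and all method names are hypothetical. The idea shown: alongside the usual one-counter-per-ordinal array, keep a small list of the ordinals that were actually touched, so that top-X extraction and clearing only visit those entries. If more ordinals are touched than the tracker can hold, the sketch falls back to a full scan.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

/** Hypothetical sketch of a sparse facet counter; not the code from the patch. */
final class SparseCounter {
    private final int[] counts;   // one counter per term ordinal
    private final int[] touched;  // ordinals whose counter went from 0 to 1
    private int touchedSize = 0;
    private boolean overflowed = false; // true once the tracker is full

    SparseCounter(int cardinality, int maxTracked) {
        this.counts = new int[cardinality];
        this.touched = new int[maxTracked];
    }

    void increment(int ordinal) {
        // Only the first hit on an ordinal needs to be recorded in the tracker.
        if (counts[ordinal]++ == 0 && !overflowed) {
            if (touchedSize < touched.length) {
                touched[touchedSize++] = ordinal;
            } else {
                overflowed = true; // too many distinct ordinals: give up tracking
            }
        }
    }

    int get(int ordinal) {
        return counts[ordinal];
    }

    boolean isSparse() {
        return !overflowed;
    }

    /** Returns up to x ordinals, highest count first; ties go to the lower ordinal. */
    int[] topX(int x) {
        // Min-heap: the head is always the weakest candidate kept so far.
        PriorityQueue<Integer> heap = new PriorityQueue<>((a, b) ->
                counts[a] != counts[b]
                        ? Integer.compare(counts[a], counts[b])
                        : Integer.compare(b, a));
        int candidates = overflowed ? counts.length : touchedSize;
        for (int i = 0; i < candidates; i++) {
            int ordinal = overflowed ? i : touched[i];
            if (counts[ordinal] == 0) {
                continue;
            }
            heap.add(ordinal);
            if (heap.size() > x) {
                heap.poll(); // drop the weakest
            }
        }
        int[] result = new int[heap.size()];
        for (int i = result.length - 1; i >= 0; i--) {
            result[i] = heap.poll();
        }
        return result;
    }

    /** Resets the counter for re-use; touches only the tracked entries when sparse. */
    void clear() {
        if (overflowed) {
            Arrays.fill(counts, 0);
        } else {
            for (int i = 0; i < touchedSize; i++) {
                counts[touched[i]] = 0;
            }
        }
        touchedSize = 0;
        overflowed = false;
    }
}
```

With a field of millions of unique terms and a query matching only a handful of them, both `topX` and `clear` in this sketch visit just the touched ordinals. Because `clear` is that cheap, the same instance can be reused across calls without allocating a new array each time.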

Currently implemented for Solr 4.10. Source, a detailed description and a directly usable WAR are available at http://tokee.github.io/lucene-solr/

This project has grown beyond a simple patch and will require a fair amount of co-operation with a committer to get into Solr. Splitting into smaller issues is a possibility.

Attachments


- author_7M_tags_1852_logged_queries_warmed.png (21/Mar/14 13:54, 10 kB, Toke Eskildsen)
- SOLR-5894_test.zip (11/Apr/14 11:18, 53 kB, Toke Eskildsen)
- SOLR-5894_test.zip (03/Apr/14 13:16, 52 kB, Toke Eskildsen)
- SOLR-5894_test.zip (31/Mar/14 13:41, 48 kB, Toke Eskildsen)
- SOLR-5894_test.zip (28/Mar/14 15:01, 45 kB, Toke Eskildsen)
- SOLR-5894_test.zip (28/Mar/14 11:47, 45 kB, Toke Eskildsen)
- SOLR-5894.patch (03/Jul/14 08:10, 103 kB, Toke Eskildsen)
- SOLR-5894.patch (18/Jun/14 12:20, 102 kB, Toke Eskildsen)
- SOLR-5894.patch (11/Apr/14 11:18, 97 kB, Toke Eskildsen)
- SOLR-5894.patch (03/Apr/14 13:16, 108 kB, Toke Eskildsen)
- SOLR-5894.patch (31/Mar/14 13:41, 97 kB, Toke Eskildsen)
- SOLR-5894.patch (28/Mar/14 15:01, 75 kB, Toke Eskildsen)
- SOLR-5894.patch (28/Mar/14 11:47, 72 kB, Toke Eskildsen)
- SOLR-5894.patch (27/Mar/14 13:00, 17 kB, Toke Eskildsen)
- SOLR-5894.patch (21/Mar/14 12:59, 15 kB, Toke Eskildsen)
- sparse_2000000docs_fc_cutoff_20140403-145412.png (03/Apr/14 13:16, 11 kB, Toke Eskildsen)
- sparse_5000000docs_20140331-151918_multi.png (31/Mar/14 13:41, 13 kB, Toke Eskildsen)
- sparse_5000000docs_20140331-151918_single.png (31/Mar/14 13:41, 13 kB, Toke Eskildsen)
- sparse_50510000docs_20140328-152807.png (28/Mar/14 15:01, 13 kB, Toke Eskildsen)

Issue Links

relates to: SOLR-13807 Caching for term facet counts (Open)

People

Assignee: Toke Eskildsen

Reporter: Toke Eskildsen

Votes: 14

Watchers: 32

Dates

Created: 21/Mar/14 12:36

Updated: 17/Jun/20 14:25