[SOLR-8559] FCS facet performance optimization - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Implemented
Affects Version/s: 5.5, 6.0
Fix Version/s: 5.5
Component/s: faceting
Labels:
- optimization
- performance

Description

While profiling a large collection (multi-sharded, billions of documents), I found that a fast query (5-10 ms) with no matches would take 20-30 seconds when faceting, even with facet.mincount=1.

Profiling made it apparent that with facet.method=fcs 99% of the time was spent here.

queue.updateTop gets called up to numOfSegments*numTerms times, the worst case being when every term is in every segment. This formula doesn't take into account whether any of the terms have a positive count with respect to the docset.

These optimizations aim to do two things:

- When mincount>0, don't include segments in which all terms have zero counts. This should significantly speed up processing when the terms are high cardinality and the matching docset is small.
- Fixed TODO optimization: when mincount>0, move the segment position to the next non-zero term value.

Both of these changes minimize the number of calls to the slow updateTop method.
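The two optimizations can be illustrated with a minimal, self-contained sketch. This is not the attached patch and not Solr's code: `FcsMergeSketch`, `Segment`, `merge`, and `queueOps` are hypothetical names, and `java.util.PriorityQueue` with `poll`/`add` stands in for Lucene's queue and its updateTop call. Each segment is assumed to expose its sorted terms together with per-term counts against the matching docset.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.TreeMap;

class FcsMergeSketch {
    /** Hypothetical per-segment view: sorted terms plus counts against the docset. */
    static final class Segment {
        final String[] terms; // sorted ascending
        final int[] counts;   // counts[i] = docset hits for terms[i] in this segment
        int pos = 0;          // current position in the term enumeration

        Segment(String[] terms, int[] counts) {
            this.terms = terms;
            this.counts = counts;
        }

        boolean exhausted() { return pos >= terms.length; }

        String term() { return terms[pos]; }

        boolean hasNonZero() {
            for (int c : counts) {
                if (c > 0) return true;
            }
            return false;
        }

        /** Advances pos to the next term with a positive count (pos itself included). */
        void skipZeros() {
            while (pos < terms.length && counts[pos] == 0) pos++;
        }
    }

    /** Number of heap re-orderings in the last merge; a stand-in for updateTop calls. */
    static int queueOps;

    static Map<String, Integer> merge(List<Segment> segments, int mincount, boolean optimized) {
        queueOps = 0;
        boolean skip = optimized && mincount > 0;
        Comparator<Segment> byTerm = Comparator.comparing(Segment::term);
        PriorityQueue<Segment> queue = new PriorityQueue<>(byTerm);
        for (Segment s : segments) {
            s.pos = 0;
            if (skip) {
                if (!s.hasNonZero()) continue; // optimization 1: drop all-zero segments
                s.skipZeros();                 // optimization 2: start at first non-zero term
            }
            if (!s.exhausted()) queue.add(s);
        }
        Map<String, Integer> result = new TreeMap<>();
        while (!queue.isEmpty()) {
            Segment top = queue.poll();
            result.merge(top.term(), top.counts[top.pos], Integer::sum);
            top.pos++;
            if (skip) top.skipZeros();         // optimization 2: jump over zero-count terms
            if (!top.exhausted()) queue.add(top);
            queueOps++;                        // one re-ordering per term visited
        }
        result.values().removeIf(c -> c < mincount);
        return result;
    }

    public static void main(String[] args) {
        List<Segment> segs = Arrays.asList(
            new Segment(new String[] {"a", "b", "c"}, new int[] {0, 0, 0}),
            new Segment(new String[] {"a", "b", "c"}, new int[] {0, 2, 0}),
            new Segment(new String[] {"b", "c", "d"}, new int[] {1, 0, 0}));
        System.out.println(merge(segs, 1, false) + " ops=" + queueOps); // {b=3} ops=9
        System.out.println(merge(segs, 1, true) + " ops=" + queueOps);  // {b=3} ops=2
    }
}
```

In this toy input both runs return the same facet counts, but the unoptimized merge touches the queue once per term per segment (9 times), while the optimized merge touches it only once per non-zero count (2 times).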

Attachments

- solr-8559.patch (16/Jan/16 23:51, 3 kB, Keith Laban)
- SOLR-8559.patch (19/Jan/16 22:43, 3 kB, Dennis Gove)
- SOLR-8559-4-10-4.patch (19/Jan/16 22:49, 3 kB, Keith Laban)

People

Assignee: Dennis Gove
Reporter: Keith Laban
Votes: 0
Watchers: 5

Dates

Created: 16/Jan/16 23:49
Updated: 09/May/16 18:45
Resolved: 22/Jan/16 21:43