[LUCENE-4602] Use DocValues to store per-doc facet ord - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 4.2, 6.0
Component/s: modules/facet
Labels:
None

Lucene Fields:

New, Patch Available

Description

Spinoff from ~~LUCENE-4600~~

DocValues can be used to hold the byte[] encoding all facet ords for
the document, instead of payloads. I made a hacked up approximation
of in-RAM DV (see CachedCountingFacetsCollector in the patch) and the
gains were somewhat surprisingly large:

                    Task    QPS base      StdDev    QPS comp      StdDev                Pct diff
                HighTerm        0.53      (0.9%)        1.00      (2.5%)   87.3% (  83% -   91%)
                 LowTerm        7.59      (0.6%)       26.75     (12.9%)  252.6% ( 237% -  267%)
                 MedTerm        3.35      (0.7%)       12.71      (9.0%)  279.8% ( 268% -  291%)

I didn't think payloads were THAT slow; I think it must be the advance
implementation?

We need to separately test on-disk DV to make sure it's at least
on-par with payloads (but hopefully faster) and if so ... we should
cutover facets to using DV.

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending

TestFacetsPayloadMigrationReader.java
15/Jan/13 15:41
17 kB
Shai Erera
LUCENE-4602.patch
09/Dec/12 16:19
24 kB
Michael McCandless
LUCENE-4602.patch
13/Dec/12 00:12
20 kB
Michael McCandless
LUCENE-4602.patch
15/Jan/13 15:34
99 kB
Shai Erera
LUCENE-4602.patch
15/Jan/13 15:41
102 kB
Shai Erera
LUCENE-4602.patch
15/Jan/13 20:52
159 kB
Shai Erera
FacetsPayloadMigrationReader.java
15/Jan/13 15:39
7 kB
Shai Erera

Issue Links

contains

LUCENE-4623 facets should index drill-down fields using DOCS_ONLY

Closed

Activity

People

Assignee:: Shai Erera

Reporter:: Michael McCandless

Votes:: 0 Vote for this issue

Watchers:: 2 Start watching this issue

Dates

Created:: 09/Dec/12 16:14

Updated:: 28/Aug/22 13:33

Resolved:: 16/Jan/13 10:10