[LUCENE-2329] Use parallel arrays instead of PostingList objects - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Minor
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 3.1, 4.0-ALPHA
Component/s: core/index
Labels: None
Lucene Fields: New

Description

This is Mike's idea that was discussed in LUCENE-2293 and LUCENE-2324.

To avoid keeping a large number of long-lived PostingList objects in TermsHashPerField, we want to switch to parallel arrays. The termsHash will simply be an int[] that maps each term to a dense termID.
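As a rough illustration of the int[] hash idea, the sketch below uses open addressing: each slot holds either -1 (empty) or a dense termID. The class name `TermIdHash`, the `getOrAdd` method, and the use of a `List<String>` to hold term text are assumptions made for this example only; they are not taken from the patch, which would keep term text in Lucene's own pools.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: an int[] hash table whose slots hold dense termIDs. */
final class TermIdHash {
    private int[] slots;                                       // -1 = empty, otherwise a termID
    private final List<String> termTexts = new ArrayList<>();  // stand-in for a shared text pool

    TermIdHash(int capacity) {
        if (capacity < 1 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        slots = new int[capacity];
        Arrays.fill(slots, -1);
    }

    /** Returns the existing termID for the term, or assigns the next dense ID. */
    int getOrAdd(String term) {
        int mask = slots.length - 1;
        int pos = term.hashCode() & mask;
        while (slots[pos] != -1) {                // linear probing
            if (termTexts.get(slots[pos]).equals(term)) {
                return slots[pos];
            }
            pos = (pos + 1) & mask;
        }
        int termID = termTexts.size();            // IDs are assigned 0, 1, 2, ...
        termTexts.add(term);
        slots[pos] = termID;
        if (termTexts.size() * 2 > slots.length) {
            rehash();                             // keep the table at most half full
        }
        return termID;
    }

    int size() {
        return termTexts.size();
    }

    private void rehash() {
        int[] bigger = new int[slots.length * 2];
        Arrays.fill(bigger, -1);
        int mask = bigger.length - 1;
        for (int termID = 0; termID < termTexts.size(); termID++) {
            int pos = termTexts.get(termID).hashCode() & mask;
            while (bigger[pos] != -1) {
                pos = (pos + 1) & mask;
            }
            bigger[pos] = termID;
        }
        slots = bigger;
    }
}
```

Because the table stores only ints, growing it copies no objects; the termIDs stay stable across a rehash and can be used directly as array indexes.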

All data that the PostingList classes currently hold will then be placed in parallel arrays, where the termID is the index into the arrays. This avoids the need for object pooling and removes the overhead of object initialization and garbage collection. Garbage collection in particular should benefit significantly when the JVM is running low on memory, because in that situation GC mark times can get very long if there are many long-lived objects in memory.
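A minimal sketch of the parallel-array layout, assuming three illustrative per-term fields (`textStarts`, `docFreqs`, `lastDocIDs`). The class name `ParallelPostings` and its methods are hypothetical and chosen for this example; the actual fields held by the PostingList classes are not listed in this issue.

```java
import java.util.Arrays;

/** Hypothetical sketch: per-term data in parallel int[] arrays indexed by termID. */
final class ParallelPostings {
    // Slot i of every array belongs to the term whose dense ID is i.
    int[] textStarts;  // assumed: offset of the term's text in a shared pool
    int[] docFreqs;    // assumed: number of distinct documents containing the term
    int[] lastDocIDs;  // assumed: last document in which the term was seen
    private int size;

    ParallelPostings(int initialCapacity) {
        int capacity = Math.max(1, initialCapacity);
        textStarts = new int[capacity];
        docFreqs = new int[capacity];
        lastDocIDs = new int[capacity];
    }

    /** Registers a new term and returns its termID; no per-term object is allocated. */
    int newTerm(int textStart) {
        if (size == textStarts.length) {
            grow();
        }
        int termID = size++;
        textStarts[termID] = textStart;
        docFreqs[termID] = 0;
        lastDocIDs[termID] = -1;  // -1 marks "not seen in any document yet"
        return termID;
    }

    /** Records that the term occurred in the given document. */
    void addOccurrence(int termID, int docID) {
        if (lastDocIDs[termID] != docID) {
            docFreqs[termID]++;
            lastDocIDs[termID] = docID;
        }
    }

    /** Doubles all arrays together so they always stay the same length. */
    private void grow() {
        int newCapacity = textStarts.length * 2;
        textStarts = Arrays.copyOf(textStarts, newCapacity);
        docFreqs = Arrays.copyOf(docFreqs, newCapacity);
        lastDocIDs = Arrays.copyOf(lastDocIDs, newCapacity);
    }
}
```

With this layout the garbage collector sees a handful of large int[] arrays instead of one object per term, which is the effect on mark times that the description is after.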

Another benefit could be more efficient TermVectors. We could avoid having to store the term string per document in the TermVector and instead store just the segment-wide termIDs. This would reduce the size and also make it easier to implement efficient algorithms that use TermVectors, because no term mapping across documents in a segment would be necessary. This improvement can be made in a separate JIRA issue, though.
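To illustrate why segment-wide termIDs would simplify such algorithms, here is a sketch that counts the terms two documents share by merging their sorted termID arrays, with no string comparison or per-document term mapping. The class `TermIdVectors` and the representation of a term vector as a sorted int[] of distinct termIDs are assumptions for this example, not an existing Lucene API.

```java
/** Hypothetical sketch: comparing term vectors that store segment-wide termIDs. */
final class TermIdVectors {
    private TermIdVectors() {}

    /**
     * Counts the terms common to two documents. Each argument is assumed to be
     * that document's termIDs, sorted ascending and without duplicates.
     */
    static int sharedTermCount(int[] docA, int[] docB) {
        int i = 0;
        int j = 0;
        int shared = 0;
        while (i < docA.length && j < docB.length) {
            if (docA[i] == docB[j]) {
                shared++;
                i++;
                j++;
            } else if (docA[i] < docB[j]) {
                i++;
            } else {
                j++;
            }
        }
        return shared;
    }
}
```

Since both documents draw their IDs from the same segment-wide numbering, equal ints mean equal terms, and the comparison runs in time linear in the two vector lengths.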

Attachments

- lucene-2329.patch, 22/Mar/10 16:04, 48 kB, Michael Busch
- lucene-2329.patch, 22/Mar/10 08:56, 49 kB, Michael Busch
- lucene-2329.patch, 22/Mar/10 08:05, 48 kB, Michael Busch
- LUCENE-2329.patch, 06/Apr/10 09:29, 9 kB, Michael McCandless
- LUCENE-2329.patch, 05/Apr/10 19:21, 9 kB, Michael McCandless
- LUCENE-2329.patch, 01/Apr/10 23:49, 21 kB, Michael McCandless
- LUCENE-2329.patch, 01/Apr/10 17:58, 11 kB, Michael McCandless
- lucene-2329-2.patch, 01/Apr/10 00:32, 11 kB, Michael Busch

People

Assignee: Michael Busch

Reporter: Michael Busch

Votes: 0

Watchers: 1

Dates

Created: 18/Mar/10 05:57

Updated: 28/Aug/22 12:22

Resolved: 04/May/10 17:18