[LUCENE-2455] Some house cleaning in addIndexes* - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Trivial
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 3.1, 4.0-ALPHA
Component/s: core/index
Labels: None
Lucene Fields: New, Patch Available

Description

Today, the use of addIndexes and addIndexesNoOptimize is confusing,
especially regarding when to invoke each. Also, addIndexes calls
optimize() at the beginning, but only on the target index. Its javadoc
also includes the following statement, which, as I understand the code,
is wrong: "After this completes, the index is optimized." In fact,
optimize() is called at the beginning and not at the end.

On the other hand, addIndexesNoOptimize does not call optimize(), and
relies on the MergeScheduler and MergePolicy to handle the merges.

After a short discussion about that on the list (Thanks Mike for the
clarifications!) I understand that there are really two core differences
between the two:

- addIndexes supports IndexReader extensions
- addIndexesNoOptimize performs better

This issue proposes the following:

- Clear up the documentation of each, spelling out the pros and cons of
  calling them in the javadocs.
- Rename addIndexesNoOptimize to addIndexes.
- Remove the optimize() call from addIndexes(IndexReader...).
- Document that clearly in both, with a recommendation to call
  optimize() beforehand on any of the Directories/Indexes if it's a
  concern.

That way, we maintain all the flexibility in the API:
addIndexes(IndexReader...) allows for using IR extensions, while
addIndexes(Directory...) is considered more efficient, because it allows
the merges to happen concurrently (depending on the MergeScheduler) and
also factors in the MergePolicy. So unless you have an IR extension,
addIndexes(Directory...) is really the one you should be using. And you
have the freedom to call optimize() before each if you care about it, or
to skip it if you don't. Either way, incurring the cost of optimize() is
entirely in the user's hands.
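
To make the proposed semantics concrete, here is a minimal toy model in plain Java. It is not Lucene code and does not use the Lucene API: ToyDirectory, ToyReader, and ToyWriter are hypothetical stand-ins for Directory, IndexReader, and IndexWriter, and the assumption that the Directory variant adds incoming segments unchanged is a simplification made only for this sketch. The point it illustrates is that neither addIndexes variant calls optimize(), so the caller decides whether to pay for it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a Directory: an index stored as a list of segments.
class ToyDirectory {
    final List<List<String>> segments = new ArrayList<>();

    ToyDirectory segment(String... docs) {
        segments.add(Arrays.asList(docs));
        return this;
    }
}

// Hypothetical stand-in for an IndexReader (or an extension): a flat view of documents.
class ToyReader {
    final List<String> docs;

    ToyReader(String... docs) {
        this.docs = Arrays.asList(docs);
    }
}

// Hypothetical stand-in for IndexWriter, modelling the behaviour proposed in this issue.
class ToyWriter {
    private final List<List<String>> segments = new ArrayList<>();

    // Directory variant: incoming segments are added as they are (an assumption
    // of this model). Later merging would be left to the MergePolicy and
    // MergeScheduler, which are not modelled here. No optimize() call.
    void addIndexes(ToyDirectory... dirs) {
        for (ToyDirectory dir : dirs) {
            for (List<String> seg : dir.segments) {
                segments.add(new ArrayList<>(seg));
            }
        }
    }

    // Reader variant: all readers are merged directly into one new segment.
    // No optimize() call here either.
    void addIndexes(ToyReader... readers) {
        List<String> merged = new ArrayList<>();
        for (ToyReader reader : readers) {
            merged.addAll(reader.docs);
        }
        segments.add(merged);
    }

    // Explicit and user-invoked: collapse everything into a single segment.
    void optimize() {
        if (segments.size() < 2) {
            return;
        }
        List<String> all = new ArrayList<>();
        for (List<String> seg : segments) {
            all.addAll(seg);
        }
        segments.clear();
        segments.add(all);
    }

    int segmentCount() {
        return segments.size();
    }

    int docCount() {
        int n = 0;
        for (List<String> seg : segments) {
            n += seg.size();
        }
        return n;
    }
}

class AddIndexesSketch {
    public static void main(String[] args) {
        ToyWriter writer = new ToyWriter();
        writer.addIndexes(new ToyDirectory().segment("a", "b").segment("c"));
        writer.addIndexes(new ToyReader("d"), new ToyReader("e"));
        System.out.println("segments before optimize: " + writer.segmentCount());
        writer.optimize(); // the caller chooses to pay this cost
        System.out.println("segments after optimize: " + writer.segmentCount());
    }
}
```

In this model the writer holds three segments after the two addIndexes calls (two from the directory, one from the merged readers) and a single segment only after the explicit optimize() call.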

BTW, addIndexes(IndexReader...) uses neither the MergeScheduler nor the
MergePolicy, but rather calls SegmentMerger directly. This might be
another place for improvement. I'll look into it, and if it's not too
complicated, I may cover it in this issue as well. If you have any hints
that can give me a good head start on that, please don't be shy.

Attachments

LUCENE-2455_3x.patch, 14/May/10 13:43, 56 kB, Shai Erera
LUCENE-2455_3x.patch, 22/May/10 06:14, 77 kB, Shai Erera
LUCENE-2455_3x.patch, 23/May/10 11:16, 105 kB, Shai Erera
LUCENE-2455_3x.patch, 25/May/10 06:43, 105 kB, Shai Erera
LUCENE-2455_3x.patch, 25/May/10 16:14, 105 kB, Shai Erera
LUCENE-2455_trunk.patch, 27/May/10 07:45, 141 kB, Shai Erera
index.31.cfs.zip, 27/May/10 09:43, 5 kB, Uwe Schindler
index.31.nocfs.zip, 27/May/10 09:43, 9 kB, Uwe Schindler

People

Assignee: Shai Erera

Reporter: Shai Erera

Dates

Created: 11/May/10 03:59

Updated: 28/Aug/22 12:26

Resolved: 27/May/10 15:37