[LUCENE-4203] Add IndexWriter.tryDeleteDocument, to delete by document id when possible - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 4.0-BETA, 6.0
Component/s: core/index
Labels: None

Lucene Fields: New

Description

Spinoff from ~~LUCENE-4069~~.

In that use case, the app must first look up a document and then call
updateDocument. This is wasteful today because the relatively costly
lookup (by a primary key field, e.g. "id") is done twice: once by the
app and once inside updateDocument.

But since the first lookup already resolved the PK to a docID, it
would be nice to delete by that docID and then call addDocument
instead.

So I worked out a rough start at this by adding
IndexWriter.tryDeleteDocument. It would be a very expert API: it takes
a SegmentInfo (referencing the segment that contains the docID), and
as long as that segment hasn't yet been merged away, it marks the
document for deletion and returns true (success). If the segment has
been merged away, it returns false and the app must then delete by
term. It only works if the writer is in NRT mode (i.e. you've opened
an NRT reader).
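
The try-then-fall-back contract described above can be sketched roughly as
follows. This is a toy model only, and it does not use Lucene types:
ToyWriter, UpdateHelper, mergeAway, deleteByTerm and the segment name "_0"
are hypothetical names invented for illustration, and the real
IndexWriter.tryDeleteDocument signature may differ from what is shown here.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy stand-in for an index writer. Hypothetical; NOT the Lucene API. */
class ToyWriter {
    // Live segment name -> doc IDs marked deleted within that segment.
    private final Map<String, Set<Integer>> liveSegments = new HashMap<>();
    // Primary keys deleted through the slower delete-by-term path.
    private final Set<String> deletedByTerm = new HashSet<>();

    void addSegment(String name) {
        liveSegments.put(name, new HashSet<>());
    }

    /** Simulates a merge: the segment disappears, so its doc IDs go stale. */
    void mergeAway(String name) {
        liveSegments.remove(name);
    }

    /** Marks the doc deleted and returns true only if the segment is still live. */
    boolean tryDeleteDocument(String segment, int docId) {
        Set<Integer> deletes = liveSegments.get(segment);
        if (deletes == null) {
            return false; // segment was merged away; caller must fall back
        }
        deletes.add(docId);
        return true;
    }

    /** Fallback path: the writer would re-resolve the primary key itself. */
    void deleteByTerm(String primaryKey) {
        deletedByTerm.add(primaryKey);
    }

    boolean isDeletedByDocId(String segment, int docId) {
        Set<Integer> deletes = liveSegments.get(segment);
        return deletes != null && deletes.contains(docId);
    }

    boolean isDeletedByTerm(String primaryKey) {
        return deletedByTerm.contains(primaryKey);
    }
}

/** Hypothetical caller-side helper showing the intended usage pattern. */
class UpdateHelper {
    /** Returns true if the fast delete-by-docID path succeeded. */
    static boolean deleteResolved(ToyWriter writer, String segment,
                                  int docId, String primaryKey) {
        if (writer.tryDeleteDocument(segment, docId)) {
            return true; // reused the docID from the first lookup
        }
        writer.deleteByTerm(primaryKey); // costs a second lookup
        return false;
    }
}
```

In this sketch the caller would follow either branch with addDocument for
the new version of the document, so the second lookup is paid only when the
segment has already been merged away.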

In ~~LUCENE-4069~~ using tryDeleteDocument gave a ~20% net speedup.

I think tryDeleteDocument would also be useful when Solr "updates" a
document by loading all stored fields, changing them, and calling
updateDocument.

Attachments


- LUCENE-4203.patch, 09/Jul/12 19:26, 3 kB, Michael McCandless
- LUCENE-4203.patch, 30/Jul/12 18:39, 10 kB, Michael McCandless

People

Assignee: Unassigned

Reporter: Michael McCandless

Votes: 0

Watchers: 3

Dates

Created: 09/Jul/12 19:21

Updated: 28/Aug/22 13:21

Resolved: 02/Aug/12 23:06