Lucene - Core / LUCENE-3511

Unexpected behavior in maxMergeDocs from Lucene 2.9.2 to 3.4.0

Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.4
    • Fix Version/s: None
    • Component/s: core/index
    • Labels: None
    • Environment: Mac OSX 10.6.8
    • Lucene Fields: New

    Description

      With Lucene 2.9, I used to set maxMergeDocs to -1 to disable it (this is also the default). If I then deleted some documents and optimized/committed the index, the deletions were purged from the index, so that IndexReader.maxDoc() == IndexReader.numDocs().

      With Lucene 3.4, maxMergeDocs can be set using IndexWriterConfig. When it is set to -1, deleted documents are not purged from the index, even after optimize/commit.

      This can lead to subtle bugs when the user expects IndexReader.maxDoc() to equal IndexReader.numDocs(). In my case, when I iterated through an index after modifying (delete+add) the documents, I could only see the deleted documents.

      Unfortunately, I can't isolate this into a small test case. It may have something to do with the size of the Lucene records. In a small test case, I did not see any *.del files appear in the index dir. In the case that failed, however, a *.del file appeared immediately after delete/commit and did not disappear upon commit/optimize/close of the index.
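
      To make the *.del observation easier to check, here is a minimal sketch that lists any *.del files left in an index directory. It uses only the JDK (no Lucene calls). The class name DelFileCheck, the helper listDelFiles, and the default path "index" are made up for illustration and are not part of Lucene.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Lucene: reports leftover *.del files.
class DelFileCheck {

    /** Returns the sorted names of all *.del files directly inside indexDir. */
    static List<String> listDelFiles(Path indexDir) throws IOException {
        List<String> names = new ArrayList<>();
        // The glob matches on the file name only, e.g. "_0_1.del".
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(indexDir, "*.del")) {
            for (Path p : stream) {
                names.add(p.getFileName().toString());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Assumed default location; pass the real index directory as the first argument.
        Path indexDir = Paths.get(args.length > 0 ? args[0] : "index");
        if (!Files.isDirectory(indexDir)) {
            System.out.println("Not a directory: " + indexDir);
            return;
        }
        List<String> delFiles = listDelFiles(indexDir);
        if (delFiles.isEmpty()) {
            System.out.println("No *.del files found in " + indexDir);
        } else {
            System.out.println("Leftover deletion files: " + delFiles);
        }
    }
}
```

      Running this against the index directory after commit/optimize/close should show whether the deletion file is still on disk. An empty result would be consistent with the behavior I saw in 2.9 and in the small test case.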

          People

            Assignee: Unassigned
            Reporter: thushara wijeratna