Lucene - Core
  LUCENE-3511

Unexpected behavior in maxMergeDocs from Lucene 2.9.2 to 3.4.0

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.4
    • Fix Version/s: None
    • Component/s: core/index
    • Labels:
      None
    • Environment:

      Mac OS X 10.6.8

    • Lucene Fields:
      New

      Description

      With Lucene 2.9, I used to set maxMergeDocs to -1 to disable it (this is also the default). If I then deleted some documents and optimized/committed the index, the deletions were removed from the index, so that IndexReader.maxDoc() == IndexReader.numDocs().

      With Lucene 3.4, maxMergeDocs can be set using IndexWriterConfig. When it is set to -1, deleted documents are not purged from the index, even after optimize/commit.

      This can lead to subtle bugs where the user expects IndexReader.maxDoc() to match IndexReader.numDocs(). In my case, when I iterated through an index after modifying (delete + add) its documents, I could only see the deleted documents.
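
The relationship between maxDoc() and numDocs() described above can be illustrated with a small toy model. This is not Lucene code: the class SegmentModel, its optimize(int) method, and the rule that a segment larger than maxMergeDocs is skipped are all hypothetical, chosen only to show how a limit of -1, if it were compared literally instead of being treated as "disabled", could leave deletions in place. Whether Lucene 3.4 actually behaves this way is not established by this report.

```java
import java.util.BitSet;

// Toy model of one index segment. NOT Lucene code; all names are hypothetical.
class SegmentModel {
    private int maxDoc;                          // document slots, including deleted ones
    private final BitSet deleted = new BitSet(); // slots marked as deleted

    SegmentModel(int docCount) {
        this.maxDoc = docCount;
    }

    // Mark a document as deleted; the slot remains until the segment is rewritten.
    void delete(int docId) {
        if (docId < 0 || docId >= maxDoc) {
            throw new IllegalArgumentException("docId out of range: " + docId);
        }
        deleted.set(docId);
    }

    int maxDoc() {
        return maxDoc;
    }

    int numDocs() {
        return maxDoc - deleted.cardinality();
    }

    // Assumed rule: a segment with more than maxMergeDocs documents is never
    // rewritten. Returns true only if the segment was compacted.
    boolean optimize(int maxMergeDocs) {
        if (maxDoc > maxMergeDocs) {
            return false;                        // skipped, deletions remain
        }
        maxDoc = numDocs();                      // rewrite without the deleted slots
        deleted.clear();
        return true;
    }
}
```

In this model, a 10-document segment with two deletions reports maxDoc() == 10 and numDocs() == 8. Calling optimize(Integer.MAX_VALUE) compacts it so that both return 8, whereas optimize(-1) skips the segment because 10 > -1, and the two counts stay different.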

      Unfortunately, I can't isolate this into a small test case. It may have something to do with the size of the Lucene records. In a small test case, no *.del files appeared in the index directory. In the case that failed, however, a *.del file appeared immediately after delete/commit and did not disappear after commit/optimize/close of the index.

        Activity

        thushara wijeratna created issue -

          People

          • Assignee:
            Unassigned
            Reporter:
            thushara wijeratna
          • Votes:
            0
            Watchers:
            0
