
Key: LUCENE-140
Type: Bug
Status: Closed
Resolution: Fixed
Priority: Major
Assignee: Michael McCandless
Reporter: legez
Votes: 5
Watchers: 5
Lucene - Java

docs out of order

Created: 07/Oct/03 08:05 PM   Updated: 27/Feb/07 06:10 PM
Component/s: Index
Affects Version/s: unspecified
Fix Version/s: 2.1

Time Tracking:
Not Specified

File Attachments:
  bug23650.txt, 2 kB, 2005-06-14 11:46 PM, Arvind Srinivasan
  corrupted.part1.rar, 9.00 MB, 2006-01-21 07:41 AM, Jarrod Cuzens (licensed for inclusion in ASF works)
  corrupted.part2.rar, 3.35 MB, 2006-01-21 07:44 AM, Jarrod Cuzens (licensed for inclusion in ASF works)
  indexing-failure.log, 2.75 MB, 2007-01-10 12:50 AM, Jed Wesley-Smith
  LUCENE-140-2007-01-09-instrumentation.patch, 6 kB, 2007-01-09 02:49 PM, Michael McCandless (licensed for inclusion in ASF works)
Environment:
Operating System: Linux
Platform: PC

Bugzilla Id: 23650
Resolution Date: 11/Jan/07 12:14 PM


Description
Hello,
I cannot work out what is happening or why, but it happens every time. I get
this exception:
java.lang.IllegalStateException: docs out of order
at org.apache.lucene.index.SegmentMerger.appendPostings(SegmentMerger.java:219)
at org.apache.lucene.index.SegmentMerger.mergeTermInfo(SegmentMerger.java:191)
at org.apache.lucene.index.SegmentMerger.mergeTermInfos(SegmentMerger.java:172)
at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:135)
at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:88)
at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:341)
at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:250)
at Optimize.main(Optimize.java:29)

It happens in both 1.2 and 1.3rc1 (incidentally, what happened to 1.3rc1? I
cannot find it under that name in either the downloads or the version list).
Everything else seems OK: I can search the index, but I cannot optimize it.
Even worse, after this exception a new segment is created every time I add new
documents and close the IndexWriter! Judging by its size, I think it contains
all the documents added before.

My index is quite big: 500,000 docs, and the index directory is about 5 GB.

It is repeatable. I drop the index and reindex everything. Afterwards I add a
few docs, try to optimize, and receive the above exception.

My documents' structure is:
static Document indexIt(String id_strony, Reader reader, String data_wydania,
        String id_wydania, String id_gazety, String data_wstawienia)
{
    Document doc = new Document();

    doc.add(Field.Keyword("id", id_strony));
    doc.add(Field.Keyword("data_wydania", data_wydania));
    doc.add(Field.Keyword("id_wydania", id_wydania));
    doc.add(Field.Text("id_gazety", id_gazety));
    doc.add(Field.Keyword("data_wstawienia", data_wstawienia));
    doc.add(Field.Text("tresc", reader));

    return doc;
}

Sincerely,
legez



Subversion Commits
Repository: ASF
Revision: #494136
Date: Mon Jan 08 18:11:08 UTC 2007
User: mikemccand
Message:
LUCENE-140: Add bounds checking to BitVector's get, set, clear methods
to prevent index corruption on calling IndexReader.deleteDocument(int
docNum) on a "slightly" out of bounds docNum. Other changes:

  * In IndexReader.deleteDocument, set hasChanges to true before
    calling doDelete in case an Exception is hit in doDelete.

  * Changed the "docs out of order" check to be tighter (<= instead of
    <) to catch boundary case that was missed.

  * Fixed small unrelated javadoc typo.
Files Changed
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/index/IndexReader.java
MODIFY /lucene/java/trunk/src/test/org/apache/lucene/index/TestIndexReader.java
MODIFY /lucene/java/trunk/CHANGES.txt
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/util/BitVector.java
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/index/SegmentMerger.java
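
The bounds checking described in the commit message above can be illustrated with a minimal sketch. This is not Lucene's BitVector source: the class name BoundedBitVector, the checkIndex helper, and the choice of exception are all assumptions made for illustration. The sketch assumes the bits are packed into a byte array, so the array holds up to 7 unused padding bits past the logical size. Presumably that is why only a "slightly" out of bounds index went unnoticed: without an explicit check it lands in the padding and is accepted silently instead of failing.

```java
/** Hypothetical illustration only; not the actual Lucene BitVector. */
final class BoundedBitVector {
    private final byte[] bits; // 8 bits per byte, so up to 7 padding bits at the end
    private final int size;    // logical number of bits (e.g. number of docs)

    BoundedBitVector(int size) {
        if (size < 0) {
            throw new IllegalArgumentException("size must be >= 0: " + size);
        }
        this.size = size;
        this.bits = new byte[(size + 7) >> 3];
    }

    // Reject any index outside [0, size), including ones that would still
    // fall inside the padded byte array.
    private void checkIndex(int bit) {
        if (bit < 0 || bit >= size) {
            throw new ArrayIndexOutOfBoundsException(bit);
        }
    }

    void set(int bit) {
        checkIndex(bit);
        bits[bit >> 3] |= 1 << (bit & 7);
    }

    void clear(int bit) {
        checkIndex(bit);
        bits[bit >> 3] &= ~(1 << (bit & 7));
    }

    boolean get(int bit) {
        checkIndex(bit);
        return (bits[bit >> 3] & (1 << (bit & 7))) != 0;
    }

    public static void main(String[] args) {
        BoundedBitVector deleted = new BoundedBitVector(10); // 10 docs, 16 physical bits
        deleted.set(3);
        System.out.println("doc 3 deleted: " + deleted.get(3));
        try {
            deleted.set(12); // inside the byte array, but not a valid doc number
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("rejected out of bounds doc 12");
        }
    }
}
```

With a size of 10 the backing array has 16 physical bits, so index 12 would be accepted without the explicit check, while index 16 would fail on the array access anyway.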

Repository: ASF
Revision: #494933
Date: Wed Jan 10 19:06:36 UTC 2007
User: mikemccand
Message:
LUCENE-140:

  - Add 2 more checks on initializing SegmentReader that raise
    IllegalStateException if corruption is detected. This would have
    caught the second cause in LUCENE-140 (incorrectly re-using old
    .del files) earlier.

  - Fixed bugs in two unit tests that tripped up on these new checks.

  - Fixed (tightened) one more boundary case (when lastDoc was 0) in
    the pre-existing "docs out of order" check in SegmentMerger.java.

  - Simplified the unit test I added to TestIndexReader to test this
    issue.
Files Changed
MODIFY /lucene/java/trunk/src/test/org/apache/lucene/index/TestSegmentTermDocs.java
MODIFY /lucene/java/trunk/src/test/org/apache/lucene/index/TestDoc.java
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/index/SegmentReader.java
MODIFY /lucene/java/trunk/src/test/org/apache/lucene/index/TestIndexReader.java
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/index/SegmentMerger.java
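
The tightened "docs out of order" check mentioned in both commit messages can be sketched as follows. This is an assumption-laden illustration, not the code in SegmentMerger.java: the class DocOrderCheck, the method checkStrictlyIncreasing, and the written counter are hypothetical names. It shows two points from the commit messages. Comparing with <= rather than < rejects a repeated doc number, and tracking how many docs have been seen (rather than relying on lastDoc being non-zero) lets doc 0 be accepted as the first entry while a second doc 0 is still caught.

```java
/** Hypothetical illustration only; not the actual SegmentMerger check. */
final class DocOrderCheck {

    // Verifies that doc numbers for one term are non-negative and strictly
    // increasing, throwing IllegalStateException otherwise.
    static void checkStrictlyIncreasing(int[] docs) {
        int lastDoc = 0;
        int written = 0; // number of docs accepted so far
        for (int doc : docs) {
            // "<=" (not "<") so that a duplicate doc number is rejected.
            // The "written > 0" guard allows doc 0 as the very first entry
            // but still rejects a later doc 0.
            if (doc < 0 || (written > 0 && doc <= lastDoc)) {
                throw new IllegalStateException(
                        "docs out of order (" + doc + " <= " + lastDoc + ")");
            }
            lastDoc = doc;
            written++;
        }
    }

    public static void main(String[] args) {
        checkStrictlyIncreasing(new int[] {0, 1, 5}); // accepted
        try {
            checkStrictlyIncreasing(new int[] {0, 0}); // duplicate doc 0
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A check written with < would let the sequence 3, 3 through, and a check that skipped the comparison whenever lastDoc was 0 would let 0, 0 through. Both sequences are rejected here.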