
Key: LUCENE-1168
Type: Bug
Status: Closed
Resolution: Fixed
Priority: Major
Assignee: Michael McCandless
Reporter: Michael McCandless
Votes: 0
Watchers: 0
Lucene - Java

TermVectors index files can become corrupt when autoCommit=false

Created: 07/Feb/08 02:52 PM   Updated: 24/Feb/08 12:40 AM
Component/s: Index
Affects Version/s: 2.3
Fix Version/s: 2.3.1, 2.4

Time Tracking:
Not Specified

File Attachments:
  LUCENE-1168.patch (text file, 8 kB), uploaded 2008-02-07 03:08 PM by Michael McCandless; licensed for inclusion in ASF works
Issue Links:
Blocker
 

Lucene Fields: New
Resolution Date: 07/Feb/08 09:15 PM


Description:
Spinoff from this thread:

http://www.gossamer-threads.com/lists/lucene/java-dev/55951

There are actually two separate cases here, both of which occur only when
autoCommit=false:

  • The first issue was caused by LUCENE-843 (sigh): if you add a bunch of
    docs with no term vectors, such that one or more flushes happen,
    and then add docs that do have term vectors, the tvx file will not
    have enough entries (= corruption).
  • The second issue was caused by bulk merging of term vectors
    (LUCENE-1120, only in trunk) and bulk merging of stored fields
    (LUCENE-1043, in 2.3). It shows up only when autoCommit=false and
    the bulk merging optimization runs. In this case, the code that
    reads the rawDocs tries to read too far in the tvx/fdx files (it's
    not really index corruption but rather a bug in the rawDocs
    reading).
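
The first case comes down to an invariant: once a tvx file exists, it needs one entry per document in the doc store it belongs to, including documents added before any term vectors appeared. The toy model below is not Lucene code. The class TvxModel, its DocStore, and the idea that the shortfall comes from back-filling only the documents buffered since the last flush are all assumptions made for illustration; the actual change is in LUCENE-1168.patch. It only sketches how a lazily created index can end up short when earlier flushes are not counted.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical names, not Lucene's DocumentsWriter.
class TvxModel {

    /** Hypothetical doc store shared across flushes, with a lazily created tvx index. */
    static final class DocStore {
        final boolean backfillWholeStore; // true = assumed "fixed" behaviour
        int docsInStore = 0;              // docs written to the shared store so far
        int docsSinceFlush = 0;           // docs buffered since the last flush
        List<Long> tvx = null;            // one entry per doc once it exists

        DocStore(boolean backfillWholeStore) {
            this.backfillWholeStore = backfillWholeStore;
        }

        void addDocument(boolean hasTermVectors) {
            if (hasTermVectors && tvx == null) {
                // First doc with term vectors: create the index and add
                // placeholder entries for docs that came before it.
                tvx = new ArrayList<>();
                int missing = backfillWholeStore ? docsInStore : docsSinceFlush;
                for (int i = 0; i < missing; i++) {
                    tvx.add(0L);
                }
            }
            if (tvx != null) {
                tvx.add(hasTermVectors ? 1L : 0L);
            }
            docsInStore++;
            docsSinceFlush++;
        }

        void flush() {
            docsSinceFlush = 0;
        }

        /** The invariant: if a tvx index exists, it has one entry per doc. */
        boolean isConsistent() {
            return tvx == null || tvx.size() == docsInStore;
        }
    }

    /** Four docs without term vectors (two flushes), then two docs with them. */
    static DocStore run(boolean backfillWholeStore) {
        DocStore store = new DocStore(backfillWholeStore);
        for (int i = 0; i < 4; i++) {
            store.addDocument(false);
            if (i % 2 == 1) {
                store.flush();
            }
        }
        store.addDocument(true);
        store.addDocument(true);
        return store;
    }

    public static void main(String[] args) {
        System.out.println("partial backfill consistent: " + run(false).isConsistent());
        System.out.println("full backfill consistent: " + run(true).isConsistent());
    }
}
```

In this model the partial back-fill leaves 2 tvx entries for 6 documents, while back-filling the whole store gives 6 entries for 6 documents, so main prints false for the first line and true for the second.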


Subversion Commits:
Repository: ASF
Revision: #619640
Date: Thu Feb 07 21:13:36 UTC 2008
User: mikemccand
Message: LUCENE-1168: fix corruption cases with mixed term vectors and autoCommit=false
Files Changed
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/index/DocumentsWriter.java
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/index/TermVectorsReader.java
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/index/FieldsReader.java
MODIFY /lucene/java/trunk/src/test/org/apache/lucene/index/TestIndexWriter.java
MODIFY /lucene/java/trunk/CHANGES.txt

Repository: ASF
Revision: #620747
Date: Tue Feb 12 10:41:37 UTC 2008
User: mikemccand
Message: LUCENE-1168 (backport to 2.3 branch): Fixed corruption cases when autoCommit=false and documents have mixed term vectors
Files Changed
MODIFY /lucene/java/branches/lucene_2_3/src/java/org/apache/lucene/index/FieldsReader.java
MODIFY /lucene/java/branches/lucene_2_3/src/java/org/apache/lucene/index/DocumentsWriter.java
MODIFY /lucene/java/branches/lucene_2_3/src/test/org/apache/lucene/index/TestIndexWriter.java
MODIFY /lucene/java/branches/lucene_2_3/CHANGES.txt