
Key: LUCENE-1521
Type: Bug
Status: Closed
Resolution: Fixed
Priority: Critical
Assignee: Michael McCandless
Reporter: Shon Vella
Votes: 0
Watchers: 1
Lucene - Java

"fdx size mismatch" exception in StoredFieldsWriter.closeDocStore() when closing index with 500M documents

Created: 16/Jan/09 03:28 PM   Updated: 25/Sep/09 04:23 PM
Component/s: Index
Affects Version/s: 2.4
Fix Version/s: 2.4.1, 2.9

Time Tracking:
Not Specified


Lucene Fields: New
Resolution Date: 19/Feb/09 10:10 AM


Description
When closing an index that contains 500,000,000 randomly generated documents, an exception is thrown:

java.lang.RuntimeException: after flush: fdx size mismatch: 500000000 docs vs 4000000004 length in bytes of _0.fdx
at org.apache.lucene.index.StoredFieldsWriter.closeDocStore(StoredFieldsWriter.java:94)
at org.apache.lucene.index.DocFieldConsumers.closeDocStore(DocFieldConsumers.java:83)
at org.apache.lucene.index.DocFieldProcessor.closeDocStore(DocFieldProcessor.java:47)
at org.apache.lucene.index.DocumentsWriter.closeDocStore(DocumentsWriter.java:367)
at org.apache.lucene.index.IndexWriter.flushDocStores(IndexWriter.java:1688)
at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3518)
at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3442)
at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:1623)
at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1588)
at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1562)
...

This appears to be a bug at StoredFieldsWriter.java:93:

if (4+state.numDocsInStore*8 != state.directory.fileLength(state.docStoreSegmentName + "." + IndexFileNames.FIELDS_INDEX_EXTENSION))

where the multiplication by 8 is causing integer overflow. The fix would be to cast state.numDocsInStore to long before multiplying.
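The overflow can be reproduced outside Lucene with the numbers from the exception message. The following is a minimal sketch, not the committed patch: the class and method names (FdxSizeCheck, expectedLengthInt, expectedLengthLong) are hypothetical, and only the formula 4 + numDocsInStore*8 and the two values 500000000 and 4000000004 come from the report.

```java
public class FdxSizeCheck {

    // Expected .fdx length using int arithmetic, as in the reported check.
    // The multiplication wraps around once numDocsInStore * 8 exceeds Integer.MAX_VALUE.
    static int expectedLengthInt(int numDocsInStore) {
        return 4 + numDocsInStore * 8;
    }

    // Same formula with the doc count widened to long before multiplying,
    // which is the fix proposed in this report.
    static long expectedLengthLong(int numDocsInStore) {
        return 4 + ((long) numDocsInStore) * 8;
    }

    public static void main(String[] args) {
        int numDocs = 500000000;          // doc count from the exception message
        long actualLength = 4000000004L;  // file length from the exception message

        // int version: wrapped value differs from the real length, so the check fires
        System.out.println("int  expected: " + expectedLengthInt(numDocs)
                + ", mismatch: " + (expectedLengthInt(numDocs) != actualLength));

        // long version: matches the real length, so no exception would be thrown
        System.out.println("long expected: " + expectedLengthLong(numDocs)
                + ", mismatch: " + (expectedLengthLong(numDocs) != actualLength));
    }
}
```

The file on disk is the right size; only the comparison value is wrong, which is why the exception reports a length that is exactly 4 + 500000000 * 8.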

It appears that this is another instance of the mistake that caused bug LUCENE-1519. I did a cursory search for *8 against the code to see if there might be yet more instances of the same mistake, but found none.



Michael McCandless added a comment - 16/Jan/09 03:59 PM
Ugh, right. Plus another one (* 16) in TermVectorsTermsWriter.java. I'll fix.

Michael McCandless added a comment - 16/Jan/09 04:19 PM
Committed revision 735043. Thanks Shon!

Michael McCandless added a comment - 19/Feb/09 01:37 AM
Reopening for backport to 2.4.1.

Michael McCandless added a comment - 19/Feb/09 10:10 AM
Committed revision 745803 on 2.4 branch.

Michael McCandless added a comment - 21/May/09 09:38 AM
Note that this issue only hits an index with many (> ~268 million) docs.
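The ~268 million figure follows from the int range. Here is a small sketch of the arithmetic, assuming the size formula has the shape 4 + numDocs * bytesPerDoc; the class and method names are hypothetical, and the 16-byte case assumes the TermVectorsTermsWriter check mentioned above has the same shape, which this issue does not show.

```java
public class OverflowThreshold {

    // Largest doc count for which 4 + numDocs * bytesPerDoc still fits in an int.
    static int maxSafeDocs(int bytesPerDoc) {
        return (Integer.MAX_VALUE - 4) / bytesPerDoc;
    }

    public static void main(String[] args) {
        // 8 bytes per doc, as in the StoredFieldsWriter check
        System.out.println("max safe docs at  8 bytes/doc: " + maxSafeDocs(8));

        // 16 bytes per doc: assumed shape of the second check (not shown in this issue)
        System.out.println("max safe docs at 16 bytes/doc: " + maxSafeDocs(16));

        // One document past the limit makes the int expression wrap negative
        int overLimit = maxSafeDocs(8) + 1;
        System.out.println("4 + " + overLimit + " * 8 as int: " + (4 + overLimit * 8));
    }
}
```

Under that assumption, a check with a multiplier of 16 would start to fail at roughly half the document count of the one with a multiplier of 8.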

Elliot Metsger added a comment - 27/Aug/09 08:32 PM - edited
I received this on 2.4.1, not sure if it is this bug or not:
Exception in thread "main" java.lang.RuntimeException: after flush: fdx size mismatch: 10 docs vs 0 length in bytes of _sl3.fdx
at org.apache.lucene.index.StoredFieldsWriter.closeDocStore(StoredFieldsWriter.java:94)
at org.apache.lucene.index.DocFieldConsumers.closeDocStore(DocFieldConsumers.java:83)
at org.apache.lucene.index.DocFieldProcessor.closeDocStore(DocFieldProcessor.java:47)
at org.apache.lucene.index.DocumentsWriter.closeDocStore(DocumentsWriter.java:367)
at org.apache.lucene.index.DocumentsWriter.flush(DocumentsWriter.java:567)
at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3540)
at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3450)
at org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:3363)
at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3408)
at edu.jhu.library.ivoa.VOImageAccessUrlDownload.go(VOImageAccessUrlDownload.java:357)
at edu.jhu.library.ivoa.VOImageAccessUrlDownload.main(VOImageAccessUrlDownload.java:103)

I'm working with over 500,000 docs in this particular index.


Elliot Metsger added a comment - 27/Aug/09 08:44 PM
Never mind, it doesn't look like this is an occurrence of this bug. Not sure what happened... the underlying storage is a ZFS file system. Anyway, this thread http://www.mail-archive.com/solr-user@lucene.apache.org/msg22264.html was helpful in explaining what may be happening.

Michael McCandless added a comment - 28/Aug/09 08:52 AM
Is there anything in your env that might be removing index files out from under the IndexWriter? Are you changing your Directory's default locking impl, or disabling locking?

ZFS should be fine – I use it in my daily development. What a fabulous file system. Snapshots & clones are very addictive...