Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Not A Problem
- Affects Version/s: 3.0.1
- Fix Version/s: None
- Component/s: None
- Environment: CentOS 5.4
- Lucene Fields: New
Description
I just saw the following error:
java.lang.RuntimeException: after flush: fdx size mismatch: -512764976 docs vs 30257618564 length in bytes of _0.fdx file exists?=true
at org.apache.lucene.index.StoredFieldsWriter.closeDocStore(StoredFieldsWriter.java:97)
at org.apache.lucene.index.DocFieldProcessor.closeDocStore(DocFieldProcessor.java:51)
at org.apache.lucene.index.DocumentsWriter.closeDocStore(DocumentsWriter.java:371)
at org.apache.lucene.index.IndexWriter.flushDocStores(IndexWriter.java:1724)
at org.apache.lucene.index.IndexWriter.doFlushInternal(IndexWriter.java:3565)
at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3491)
at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3482)
at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:1658)
at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1621)
at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1585)
Note the negative SegmentWriteState.numDocsInStore. I assume this is because Lucene has a limit of 2^31 - 1 = 2,147,483,647 documents per index (Integer.MAX_VALUE, the range of a signed 32-bit int), though I couldn't find this documented clearly anywhere. It would have been nice to get this error back when I first exceeded the limit, rather than now, after a lot of indexing that was apparently doomed to fail.
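The numbers in the trace are consistent with a 32-bit wraparound. Assuming the 3.0-era .fdx layout of a 4-byte header plus one 8-byte pointer per document (the formula the size check in StoredFieldsWriter appears to use), the reported file length implies roughly 3.78 billion documents, which truncates to exactly the negative count in the exception. A minimal sketch:

```java
public class FdxOverflow {
    public static void main(String[] args) {
        // Length in bytes reported in the exception message.
        long fdxLength = 30257618564L;

        // Assumed 3.0-era .fdx layout: 4-byte header + 8 bytes per document.
        long trueDocCount = (fdxLength - 4) / 8;

        // Truncating to a signed 32-bit int wraps past Integer.MAX_VALUE.
        int wrapped = (int) trueDocCount;

        System.out.println(trueDocCount); // 3782202320
        System.out.println(wrapped);      // -512764976, the doc count in the exception
    }
}
```

So the index likely blew past the 2^31 - 1 limit long before close() was called, and the mismatch only surfaced when the final size check ran.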
Hence, two suggestions:
- Document clearly that the maximum number of documents in a Lucene index is Integer.MAX_VALUE (2^31 - 1).
- Throw an exception as soon as an IndexWriter first exceeds this limit, rather than only on close().
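The second suggestion amounts to a fail-fast guard on the document counter. This is a hypothetical sketch, not actual Lucene internals: DocCountGuard and its methods are made-up names illustrating where such a check could live, rejecting the add that would exceed the limit instead of letting the overflow surface at close():

```java
// Hypothetical sketch of the requested fail-fast behavior (not Lucene code):
// reject an addDocument() that would push the count past Integer.MAX_VALUE.
class DocCountGuard {
    private long numDocs;

    DocCountGuard(long startingDocCount) {
        this.numDocs = startingDocCount;
    }

    // Called once per added document, before the document is written.
    void onAddDocument() {
        if (numDocs + 1 > Integer.MAX_VALUE) {
            throw new IllegalStateException(
                "number of documents in the index cannot exceed "
                    + Integer.MAX_VALUE);
        }
        numDocs++;
    }

    long numDocs() {
        return numDocs;
    }
}
```

With a check like this, the failure happens on the first add past the limit, with a message naming the actual constraint, rather than as a confusing size mismatch after hours of doomed indexing.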