Lucene - Core / LUCENE-2420

"fdx size mismatch" overflow causes RuntimeException

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.0.1
    • Fix Version/s: None
    • Component/s: core/index
    • Labels:
      None
    • Environment:

      CentOS 5.4

    • Lucene Fields:
      New

      Description

      I just saw the following error:

      java.lang.RuntimeException: after flush: fdx size mismatch: -512764976 docs vs 30257618564 length in bytes of _0.fdx file exists?=true
      at org.apache.lucene.index.StoredFieldsWriter.closeDocStore(StoredFieldsWriter.java:97)
      at org.apache.lucene.index.DocFieldProcessor.closeDocStore(DocFieldProcessor.java:51)
      at org.apache.lucene.index.DocumentsWriter.closeDocStore(DocumentsWriter.java:371)
      at org.apache.lucene.index.IndexWriter.flushDocStores(IndexWriter.java:1724)
      at org.apache.lucene.index.IndexWriter.doFlushInternal(IndexWriter.java:3565)
      at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3491)
      at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3482)
      at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:1658)
      at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1621)
      at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1585)

      Note the negative SegmentWriteState.numDocsInStore. I assume this is because Lucene has a limit of 2^31 - 1 = 2147483647 (Integer.MAX_VALUE) documents per index, though I couldn't find this documented clearly anywhere. It would have been nice to get this error earlier, at the point where I exceeded the limit, rather than now, after a bunch of indexing that was apparently doomed to fail.
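
      The numbers in the message are consistent with a 32-bit counter wrapping around. Below is a minimal sketch of that arithmetic. It assumes the .fdx file is a 4-byte header followed by one 8-byte pointer per document; that layout is an assumption about the 3.0.x format, not something the error message states. The class and method names are hypothetical and not part of Lucene.

```java
// Hypothetical helper, not part of Lucene: reproduces the arithmetic behind the error message.
final class FdxSizeCheck {
    // Assumed .fdx layout: a 4-byte header followed by one 8-byte pointer per document.
    static final long HEADER_BYTES = 4L;
    static final long BYTES_PER_DOC = 8L;

    /** Number of documents implied by the .fdx file length, under the assumed layout. */
    static long impliedDocCount(long fdxLengthInBytes) {
        return (fdxLengthInBytes - HEADER_BYTES) / BYTES_PER_DOC;
    }

    /** Value a 32-bit int counter would hold after counting that many documents. */
    static int wrappedCounter(long trueDocCount) {
        // Narrowing keeps only the low 32 bits, the same result as repeated int increments.
        return (int) trueDocCount;
    }

    public static void main(String[] args) {
        long fdxLength = 30257618564L; // file length reported in the exception
        long docs = impliedDocCount(fdxLength);
        System.out.println("implied docs:   " + docs);
        System.out.println("int counter:    " + wrappedCounter(docs));
        System.out.println("over the limit: " + (docs > Integer.MAX_VALUE));
    }
}
```

      Under that assumption the file length implies 3782202320 documents, and narrowing that value to an int gives -512764976, which is the count shown in the exception.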

      Hence, two suggestions:

      • State clearly somewhere that the maximum number of documents in a Lucene index is Integer.MAX_VALUE (2^31 - 1 = 2147483647).
      • Throw an exception when an IndexWriter first exceeds this number rather than only on close.
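
      To illustrate the second suggestion, here is a rough sketch of a counter that refuses to go past the limit instead of wrapping silently. DocCountGuard and its methods are hypothetical names made up for this example; this is not how IndexWriter is implemented, only the kind of check I have in mind.

```java
// Hypothetical sketch, not Lucene's IndexWriter: a document counter that fails fast at the limit.
final class DocCountGuard {
    static final int MAX_DOCS = Integer.MAX_VALUE; // 2^31 - 1 = 2147483647

    private int numDocs;

    DocCountGuard(int initialDocs) {
        if (initialDocs < 0) {
            throw new IllegalArgumentException("initialDocs must be >= 0, got " + initialDocs);
        }
        this.numDocs = initialDocs;
    }

    /** Counts one more document, or throws before the int counter can wrap. */
    int reserveOne() {
        if (numDocs == MAX_DOCS) {
            throw new IllegalStateException(
                "cannot add document: index already holds the maximum of " + MAX_DOCS + " documents");
        }
        return ++numDocs;
    }

    int numDocs() {
        return numDocs;
    }

    public static void main(String[] args) {
        DocCountGuard guard = new DocCountGuard(Integer.MAX_VALUE - 1);
        System.out.println(guard.reserveOne()); // takes the last available slot
        try {
            guard.reserveOne(); // one past the limit
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

      With a check like this, the failure would surface on the add that crosses the limit, and the counter would never go negative.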

        Activity

        Steven Bethard created issue

          People

          • Assignee: Unassigned
          • Reporter: Steven Bethard
          • Votes: 1
          • Watchers: 2
