Lucene - Core / LUCENE-1521

"fdx size mismatch" exception in StoredFieldsWriter.closeDocStore() when closing index with 500M documents

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.4
    • Fix Version/s: 2.4.1, 2.9
    • Component/s: core/index
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      When closing an index that contains 500,000,000 randomly generated documents, an exception is thrown:

      java.lang.RuntimeException: after flush: fdx size mismatch: 500000000 docs vs 4000000004 length in bytes of _0.fdx
      at org.apache.lucene.index.StoredFieldsWriter.closeDocStore(StoredFieldsWriter.java:94)
      at org.apache.lucene.index.DocFieldConsumers.closeDocStore(DocFieldConsumers.java:83)
      at org.apache.lucene.index.DocFieldProcessor.closeDocStore(DocFieldProcessor.java:47)
      at org.apache.lucene.index.DocumentsWriter.closeDocStore(DocumentsWriter.java:367)
      at org.apache.lucene.index.IndexWriter.flushDocStores(IndexWriter.java:1688)
      at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3518)
      at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3442)
      at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:1623)
      at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1588)
      at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1562)
      ...

      This appears to be a bug at StoredFieldsWriter.java:93:

      if (4+state.numDocsInStore*8 != state.directory.fileLength(state.docStoreSegmentName + "." + IndexFileNames.FIELDS_INDEX_EXTENSION))

      where the multiplication by 8 overflows a 32-bit int: 500,000,000 * 8 = 4,000,000,000, which exceeds Integer.MAX_VALUE (2,147,483,647). The fix would be to cast state.numDocsInStore to long before multiplying.
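The overflow described above can be reproduced outside Lucene. The following is a minimal sketch, not the committed patch: the class name `FdxSizeCheck` and both method names are hypothetical, and only the arithmetic `4 + numDocsInStore * 8` is taken from the report.

```java
public class FdxSizeCheck {

    // Mirrors the arithmetic in the reported check: both operands are int,
    // so the product wraps around before it is widened to long.
    static long expectedLengthInt(int numDocsInStore) {
        return 4 + numDocsInStore * 8;
    }

    // Proposed fix from the report: widen to long before multiplying.
    static long expectedLengthLong(int numDocsInStore) {
        return 4 + ((long) numDocsInStore) * 8;
    }

    public static void main(String[] args) {
        int numDocs = 500000000;
        // Wrapped value; it can never equal a real file length.
        System.out.println(expectedLengthInt(numDocs));  // -294967292
        // Matches the length reported in the exception message.
        System.out.println(expectedLengthLong(numDocs)); // 4000000004
    }
}
```

With the cast in place, the expected length equals the 4000000004 bytes quoted in the exception, so the comparison against the actual file length would pass.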

      It appears that this is another instance of the mistake that caused bug LUCENE-1519. I did a cursory search for *8 in the code to see if there might be more instances of the same mistake, but found none.

          Activity

          Michael McCandless added a comment -

          Is there anything in your env that might be removing index files out from under the IndexWriter? Are you changing your Directory's default locking impl, or disabling locking?

          ZFS should be fine; I use it in my daily development. What a fabulous file system! Snapshots & clones are very addictive...

          Elliot Metsger added a comment -

          Never mind, it doesn't look like this is an occurrence of this bug. Not sure what happened... the underlying storage is a ZFS file system. Anyway, this thread http://www.mail-archive.com/solr-user@lucene.apache.org/msg22264.html was helpful in explaining what may be happening.

          Elliot Metsger added a comment (edited) -

          I received this on 2.4.1, not sure if it is this bug or not:
          Exception in thread "main" java.lang.RuntimeException: after flush: fdx size mismatch: 10 docs vs 0 length in bytes of _sl3.fdx
          at org.apache.lucene.index.StoredFieldsWriter.closeDocStore(StoredFieldsWriter.java:94)
          at org.apache.lucene.index.DocFieldConsumers.closeDocStore(DocFieldConsumers.java:83)
          at org.apache.lucene.index.DocFieldProcessor.closeDocStore(DocFieldProcessor.java:47)
          at org.apache.lucene.index.DocumentsWriter.closeDocStore(DocumentsWriter.java:367)
          at org.apache.lucene.index.DocumentsWriter.flush(DocumentsWriter.java:567)
          at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3540)
          at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3450)
          at org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:3363)
          at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3408)
          at edu.jhu.library.ivoa.VOImageAccessUrlDownload.go(VOImageAccessUrlDownload.java:357)
          at edu.jhu.library.ivoa.VOImageAccessUrlDownload.main(VOImageAccessUrlDownload.java:103)

          I'm working with over 500,000 docs in this particular index.

          Michael McCandless added a comment -

          Note that this issue only hits an index with many (> ~268 million) docs.
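The ~268 million figure follows from the largest value an int can hold. As a rough sketch (the class and method names are hypothetical, and the 4-byte header for the `* 16` case is an assumption, since the thread only gives the multiplier for TermVectorsTermsWriter.java), the first overflowing document count can be computed like this:

```java
public class OverflowThreshold {

    // Smallest doc count for which headerBytes + numDocs * bytesPerDoc
    // no longer fits in a signed 32-bit int.
    static long firstOverflowingDocCount(int headerBytes, int bytesPerDoc) {
        long maxSafe = ((long) Integer.MAX_VALUE - headerBytes) / bytesPerDoc;
        return maxSafe + 1;
    }

    public static void main(String[] args) {
        // Check from the report: 4 + numDocs * 8
        System.out.println(firstOverflowingDocCount(4, 8));  // 268435456
        // Multiplier 16, with an assumed 4-byte header
        System.out.println(firstOverflowingDocCount(4, 16)); // 134217728
    }
}
```

Indexes below these document counts never reach the overflowing branch, which is consistent with the bug going unnoticed on smaller indexes.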

          Michael McCandless added a comment -

          Committed revision 745803 on 2.4 branch.

          Michael McCandless added a comment -

          Reopening for backport to 2.4.1.

          Michael McCandless added a comment -

          Committed revision 735043. Thanks Shon!

          Michael McCandless added a comment -

          Ugh, right. Plus another one (* 16) in TermVectorsTermsWriter.java. I'll fix.


            People

            • Assignee: Michael McCandless
            • Reporter: Shon Vella
            • Votes: 0
            • Watchers: 2
