Geode / GEODE-7703

Lucene IndexWriter Creation Failure

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.12.0
    • Component/s: lucene
    • Labels:

      Description

      While computing the index repository, initialization can fail if the fileAndChunk region is modified while the IndexWriter is being initialized.
      The exception stack trace varies from run to run, but it always involves an IOException, with varying causes, thrown while reading the index files. Some examples are shown below:

      Caused by: java.io.FileNotFoundException: segments_1
      	at org.apache.geode.cache.lucene.internal.filesystem.FileSystem.getFile(FileSystem.java:101)
      	at org.apache.geode.cache.lucene.internal.directory.RegionDirectory.openInput(RegionDirectory.java:115)
      	at org.apache.lucene.store.Directory.openChecksumInput(Directory.java:137)
      	at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:286)
      	at org.apache.lucene.index.IndexFileDeleter.<init>(IndexFileDeleter.java:165)
      	at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:974)
      	at org.apache.geode.cache.lucene.internal.IndexRepositoryFactory.finishComputingRepository(IndexRepositoryFactory.java:130)
      	at org.apache.geode.cache.lucene.internal.IndexRepositoryFactory.computeIndexRepository(IndexRepositoryFactory.java:67)
      	at org.apache.geode.cache.lucene.internal.IndexRepositoryFactoryDistributedTest.lambda$testBecomePrimaryWhileIndexing$566b4a0f$5(IndexRepositoryFactoryDistributedTest.java:224)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at org.apache.geode.test.dunit.internal.MethodInvoker.executeObject(MethodInvoker.java:123)
      	at org.apache.geode.test.dunit.internal.RemoteDUnitVM.executeMethodOnObject(RemoteDUnitVM.java:78)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:357)
      	at sun.rmi.transport.Transport$1.run(Transport.java:200)
      	at sun.rmi.transport.Transport$1.run(Transport.java:197)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
      	at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:573)
      	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:834)
      	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:688)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:687)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at java.lang.Thread.run(Thread.java:748)
      
      Caused by: java.io.EOFException: Read past end of file _3z.si
      	at org.apache.geode.cache.lucene.internal.directory.FileIndexInput.readByte(FileIndexInput.java:103)
      	at org.apache.lucene.store.BufferedChecksumIndexInput.readByte(BufferedChecksumIndexInput.java:41)
      	at org.apache.lucene.store.DataInput.readInt(DataInput.java:101)
      	at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:194)
      	at org.apache.lucene.codecs.CodecUtil.checkIndexHeader(CodecUtil.java:255)
      	at org.apache.lucene.codecs.lucene62.Lucene62SegmentInfoFormat.read(Lucene62SegmentInfoFormat.java:93)
      	at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:357)
      	at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:288)
      	... 20 more
      	Suppressed: org.apache.lucene.index.CorruptIndexException: checksum status indeterminate: unexpected exception (resource=BufferedChecksumIndexInput(_3z.si))
      		at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:471)
      		at org.apache.lucene.codecs.lucene62.Lucene62SegmentInfoFormat.read(Lucene62SegmentInfoFormat.java:252)
      		... 22 more
      	Caused by: java.io.EOFException: Read past end of file _3z.si
      		at org.apache.geode.cache.lucene.internal.directory.FileIndexInput.readBytes(FileIndexInput.java:124)
      		at org.apache.lucene.store.BufferedChecksumIndexInput.readBytes(BufferedChecksumIndexInput.java:49)
      		at org.apache.lucene.store.DataInput.readBytes(DataInput.java:87)
      		at org.apache.lucene.store.DataInput.skipBytes(DataInput.java:350)
      		at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:458)
      		... 23 more
      

      The issue is extremely hard to reproduce because the time window for the race is very small. The fix is to return null from the IndexRepositoryFactory whenever the exception occurs and let the caller retry (the internal logic for retrying is already in place).
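
      The return-null-and-retry approach can be sketched as follows. This is a minimal illustration, not the actual Geode change: the class IndexRepositoryRetrySketch, the RepositoryOpener interface, and the methods computeRepository and computeWithRetry are all hypothetical names, and the bounded retry loop is an assumption about how a caller might react to a null result.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

public class IndexRepositoryRetrySketch {

  /** Hypothetical stand-in for the step that opens the IndexWriter. */
  public interface RepositoryOpener<T> {
    T open() throws IOException;
  }

  /**
   * Treats an IOException during initialization as transient and
   * returns null so that the caller can try again later.
   */
  public static <T> T computeRepository(RepositoryOpener<T> opener) {
    try {
      return opener.open();
    } catch (IOException e) {
      // The index files changed while being read; report "not ready yet".
      return null;
    }
  }

  /** Hypothetical caller: retries until a repository exists or attempts run out. */
  public static <T> T computeWithRetry(RepositoryOpener<T> opener, int maxAttempts) {
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      T repository = computeRepository(opener);
      if (repository != null) {
        return repository;
      }
    }
    return null;
  }

  public static void main(String[] args) {
    AtomicInteger calls = new AtomicInteger();
    // Fails twice with the kind of exception seen in the traces, then succeeds.
    RepositoryOpener<String> flaky = () -> {
      if (calls.incrementAndGet() < 3) {
        throw new FileNotFoundException("segments_1");
      }
      return "repository";
    };
    String result = computeWithRetry(flaky, 5);
    System.out.println(result + " after " + calls.get() + " attempts");
  }
}
```

      Returning null rather than propagating the exception keeps the failure local to one initialization attempt, which suits a race whose window is small enough that a later attempt is likely to read a consistent set of index files.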

              People

              • Assignee: echobravo Ernest Burghardt
              • Reporter: jjramos Juan Ramos
              • Votes: 0
              • Watchers: 3

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Time Tracking

                  • Estimated: Not Specified
                  • Remaining: 0h
                  • Logged: 40m