Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
While computing the index repository, initialization can fail if the fileAndChunk region is modified while the IndexWriter is being initialized.
The exception stack trace varies from run to run, but it always involves an IOException, with varying causes, thrown while reading the index files. Some examples are shown below:
Caused by: java.io.FileNotFoundException: segments_1
    at org.apache.geode.cache.lucene.internal.filesystem.FileSystem.getFile(FileSystem.java:101)
    at org.apache.geode.cache.lucene.internal.directory.RegionDirectory.openInput(RegionDirectory.java:115)
    at org.apache.lucene.store.Directory.openChecksumInput(Directory.java:137)
    at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:286)
    at org.apache.lucene.index.IndexFileDeleter.<init>(IndexFileDeleter.java:165)
    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:974)
    at org.apache.geode.cache.lucene.internal.IndexRepositoryFactory.finishComputingRepository(IndexRepositoryFactory.java:130)
    at org.apache.geode.cache.lucene.internal.IndexRepositoryFactory.computeIndexRepository(IndexRepositoryFactory.java:67)
    at org.apache.geode.cache.lucene.internal.IndexRepositoryFactoryDistributedTest.lambda$testBecomePrimaryWhileIndexing$566b4a0f$5(IndexRepositoryFactoryDistributedTest.java:224)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.geode.test.dunit.internal.MethodInvoker.executeObject(MethodInvoker.java:123)
    at org.apache.geode.test.dunit.internal.RemoteDUnitVM.executeMethodOnObject(RemoteDUnitVM.java:78)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:357)
    at sun.rmi.transport.Transport$1.run(Transport.java:200)
    at sun.rmi.transport.Transport$1.run(Transport.java:197)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
    at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:573)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:834)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:688)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:687)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.EOFException: Read past end of file _3z.si
    at org.apache.geode.cache.lucene.internal.directory.FileIndexInput.readByte(FileIndexInput.java:103)
    at org.apache.lucene.store.BufferedChecksumIndexInput.readByte(BufferedChecksumIndexInput.java:41)
    at org.apache.lucene.store.DataInput.readInt(DataInput.java:101)
    at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:194)
    at org.apache.lucene.codecs.CodecUtil.checkIndexHeader(CodecUtil.java:255)
    at org.apache.lucene.codecs.lucene62.Lucene62SegmentInfoFormat.read(Lucene62SegmentInfoFormat.java:93)
    at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:357)
    at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:288)
    ... 20 more
    Suppressed: org.apache.lucene.index.CorruptIndexException: checksum status indeterminate: unexpected exception (resource=BufferedChecksumIndexInput(_3z.si))
        at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:471)
        at org.apache.lucene.codecs.lucene62.Lucene62SegmentInfoFormat.read(Lucene62SegmentInfoFormat.java:252)
        ... 22 more
    Caused by: java.io.EOFException: Read past end of file _3z.si
        at org.apache.geode.cache.lucene.internal.directory.FileIndexInput.readBytes(FileIndexInput.java:124)
        at org.apache.lucene.store.BufferedChecksumIndexInput.readBytes(BufferedChecksumIndexInput.java:49)
        at org.apache.lucene.store.DataInput.readBytes(DataInput.java:87)
        at org.apache.lucene.store.DataInput.skipBytes(DataInput.java:350)
        at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:458)
        ... 23 more
The issue is extremely hard to reproduce because the time window in which the race can occur is very small. The fix is to return null from the IndexRepositoryFactory whenever the exception occurs and let the caller retry (the internal logic for retrying is already in place).
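The return-null-and-retry idea can be illustrated with a minimal, self-contained sketch. This is not the Geode implementation: the RepositoryOpener interface, the method names, and the bounded retry loop are all hypothetical stand-ins chosen for illustration, and the real caller's retry logic may look quite different.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

class IndexRepositoryRetrySketch {

  /** Hypothetical stand-in for the step that opens the index (e.g. creating an IndexWriter). */
  interface RepositoryOpener<T> {
    T open() throws IOException;
  }

  /**
   * Returns the opened repository, or null if an IOException occurred while
   * reading the index files. Null signals "try again later" to the caller.
   */
  static <T> T computeRepositoryOrNull(RepositoryOpener<T> opener) {
    try {
      return opener.open();
    } catch (IOException e) {
      // Assumed to be transient: the underlying files changed while being read.
      return null;
    }
  }

  /**
   * Hypothetical caller: retries a bounded number of times while the factory
   * method keeps returning null. The bound is an assumption of this sketch.
   */
  static <T> T getRepositoryWithRetry(RepositoryOpener<T> opener, int maxAttempts) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      T repository = computeRepositoryOrNull(opener);
      if (repository != null) {
        return repository;
      }
    }
    throw new IllegalStateException("Repository not available after " + maxAttempts + " attempts");
  }

  public static void main(String[] args) {
    AtomicInteger calls = new AtomicInteger();
    // Simulates the race: the first two attempts fail, the third succeeds.
    RepositoryOpener<String> flaky = () -> {
      if (calls.incrementAndGet() < 3) {
        throw new FileNotFoundException("segments_1");
      }
      return "repository";
    };
    String repository = getRepositoryWithRetry(flaky, 5);
    System.out.println(repository + " after " + calls.get() + " attempts");
  }
}
```

The sketch uses an iterative loop with an attempt limit so that a persistent failure surfaces as an error rather than retrying indefinitely; whether a limit is appropriate depends on how the real caller handles repeated null results.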
Issue Links
- fixes: GEODE-7516 RebalanceWithRedundancyDUnitTest.returnCorrectResultsWhenIndexUpdateHappensIntheMiddleofGII(PARTITION_REDUNDANT) (Closed)
- is related to: GEODE-8536 StackOverflow can occur when Lucene IndexWriter is unable to be created (Closed)
- links to