Details

Type: Bug
Status: Resolved
Priority: Critical
Resolution: Duplicate
Affects Version/s: 1.2.2
Fix Version/s: None
Component/s: None
Description
We are using an S3-based datastore which (for unrelated reasons) sometimes starts to miss certain blobs and throws an exception, see below. Unfortunately, this seems to block the indexing of any new content, as the index will try again and again to index the missing binary and fail at the same point.
It would be great if the indexing process could be more resilient against errors like this. (I think the datastore implementation should probably not propagate that exception to the outside but just log it, but that's a separate issue.)
This is seen with Oak 1.2.2. I had a look at the latest version on trunk, but the behavior does not seem to have changed since then.
17.12.2015 20:50:26.418 -0500 *ERROR* [pool-7-thread-5] org.apache.sling.commons.scheduler.impl.QuartzScheduler Exception during job execution of org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate@5cc5e2f6 : Error occurred while obtaining InputStream for blobId [2832539c16b1a2e5745370ee89e41ab562436c5f#109419]
java.lang.RuntimeException: Error occurred while obtaining InputStream for blobId [2832539c16b1a2e5745370ee89e41ab562436c5f#109419]
    at org.apache.jackrabbit.oak.plugins.blob.BlobStoreBlob.getNewStream(BlobStoreBlob.java:49)
    at org.apache.jackrabbit.oak.plugins.segment.SegmentBlob.getNewStream(SegmentBlob.java:84)
    at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexFile.loadBlob(OakDirectory.java:216)
    at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexFile.readBytes(OakDirectory.java:264)
    at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexInput.readBytes(OakDirectory.java:350)
    at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexInput.readByte(OakDirectory.java:356)
    at org.apache.lucene.store.DataInput.readInt(DataInput.java:84)
    at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:126)
    at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader.<init>(Lucene41PostingsReader.java:75)
    at org.apache.lucene.codecs.lucene41.Lucene41PostingsFormat.fieldsProducer(Lucene41PostingsFormat.java:430)
    at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.<init>(PerFieldPostingsFormat.java:195)
    at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat.fieldsProducer(PerFieldPostingsFormat.java:244)
    at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:116)
    at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:96)
    at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:141)
    at org.apache.lucene.index.BufferedUpdatesStream.applyDeletesAndUpdates(BufferedUpdatesStream.java:279)
    at org.apache.lucene.index.IndexWriter.applyAllDeletesAndUpdates(IndexWriter.java:3191)
    at org.apache.lucene.index.IndexWriter.maybeApplyDeletes(IndexWriter.java:3182)
    at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3155)
    at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3123)
    at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:988)
    at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:932)
    at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:894)
    at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorContext.closeWriter(LuceneIndexEditorContext.java:169)
    at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor.leave(LuceneIndexEditor.java:190)
    at org.apache.jackrabbit.oak.plugins.index.IndexUpdate.leave(IndexUpdate.java:221)
    at org.apache.jackrabbit.oak.spi.commit.VisibleEditor.leave(VisibleEditor.java:63)
    at org.apache.jackrabbit.oak.spi.commit.EditorDiff.process(EditorDiff.java:56)
    at org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate.updateIndex(AsyncIndexUpdate.java:367)
    at org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate.run(AsyncIndexUpdate.java:312)
    at org.apache.sling.commons.scheduler.impl.QuartzJobExecutor.execute(QuartzJobExecutor.java:105)
    at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: org.apache.jackrabbit.core.data.DataStoreException: Could not length of dataIdentifier 2832539c16b1a2e5745370ee89e41ab562436c5f
    at org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore.getStream(DataStoreBlobStore.java:465)
    at org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore.getInputStream(DataStoreBlobStore.java:297)
    at org.apache.jackrabbit.oak.plugins.blob.BlobStoreBlob.getNewStream(BlobStoreBlob.java:47)
    ... 34 common frames omitted
Caused by: org.apache.jackrabbit.core.data.DataStoreException: Could not length of dataIdentifier 2832539c16b1a2e5745370ee89e41ab562436c5f
    at org.apache.jackrabbit.aws.ext.ds.S3Backend.getLength(S3Backend.java:474)
    at org.apache.jackrabbit.core.data.CachingDataStore.getLength(CachingDataStore.java:669)
    at org.apache.jackrabbit.core.data.CachingDataStore.getRecord(CachingDataStore.java:467)
    at org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore.getDataRecord(DataStoreBlobStore.java:474)
    at org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore.getStream(DataStoreBlobStore.java:463)
    ... 36 common frames omitted
Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: E29ADB7F4BE7E12F)
    at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1078)
    at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:726)
    at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:461)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:296)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3736)
    at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1027)
    at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1005)
    at org.apache.jackrabbit.aws.ext.ds.S3Backend.getLength(S3Backend.java:467)
    ... 40 common frames omitted
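For illustration, the resilience suggested above could be sketched as a wrapper that catches the datastore exception at the blob-access boundary, logs it, and reports the blob as unavailable so the indexer can skip that binary instead of failing the whole cycle on every run. Note this is a minimal sketch using only JDK types; the BlobSource interface, ResilientBlobReader class, and openOrSkip method are hypothetical stand-ins, not Oak's actual API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.logging.Logger;

/**
 * Sketch: instead of propagating a missing-blob exception (which stalls the
 * async index update), log it and return null so the caller can skip the
 * binary. All names here are hypothetical, not Oak's real classes.
 */
public class ResilientBlobReader {

    private static final Logger LOG =
            Logger.getLogger(ResilientBlobReader.class.getName());

    /** Hypothetical stand-in for a datastore lookup that may fail. */
    public interface BlobSource {
        InputStream open(String blobId) throws IOException;
    }

    private final BlobSource source;

    public ResilientBlobReader(BlobSource source) {
        this.source = source;
    }

    /**
     * Returns the blob's stream, or null if the backend reports it missing.
     * The caller (the index editor, in this sketch) treats null as "skip
     * this binary" rather than aborting the indexing cycle.
     */
    public InputStream openOrSkip(String blobId) {
        try {
            return source.open(blobId);
        } catch (IOException e) {
            // Log and continue instead of rethrowing, so one broken blob
            // cannot block indexing of all subsequent content.
            LOG.warning("Blob [" + blobId + "] unavailable, skipping: "
                    + e.getMessage());
            return null;
        }
    }
}
```

With this shape, the repeated-failure loop in the report goes away: the missing blob is logged once per attempt but the indexer moves on, which matches the suggestion that the datastore should log rather than propagate such errors.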