  1. Jackrabbit Oak
  2. OAK-3813

Exception in datastore causes async index to stop indexing new content

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Duplicate
    • Affects Version/s: 1.2.2
    • Fix Version/s: None
    • Component/s: lucene
    • Labels: None

    Description

      We are using an S3-based datastore that (for unrelated reasons) sometimes loses certain blobs and then throws an exception; see the stack trace below. Unfortunately, this seems to block the indexing of any new content: the async index update tries again and again to index the missing binary and fails at the same point every time.

      It would be great if the indexing process were more resilient against errors like this. (I think the datastore implementation should probably not propagate that exception to the outside but just log it; that is a separate issue.)

      This is seen with Oak 1.2.2. I had a look at the latest version on trunk, and the behavior does not seem to have changed there.
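To make the blocking behavior described above concrete, here is a minimal sketch in plain Java. It is a simplified model, not Oak code: the class `StuckCheckpointDemo`, its methods, and the idea of a single integer checkpoint are all assumptions made for illustration. It only shows why a job that commits its progress on full success, and is simply re-run after a failure, never gets past one unreadable blob.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical model (not Oak's AsyncIndexUpdate): items are indexed in
// order, and progress is only committed when a whole run succeeds.
final class StuckCheckpointDemo {

    /** Simulates one scheduled run and returns the new checkpoint. */
    static int runOnce(List<String> items, int checkpoint,
                       Set<String> missingBlobs, List<String> indexed) {
        List<String> batch = new ArrayList<>();
        for (int i = checkpoint; i < items.size(); i++) {
            String item = items.get(i);
            if (missingBlobs.contains(item)) {
                // Stand-in for the RuntimeException seen in the log:
                // the whole run aborts and nothing is committed.
                throw new RuntimeException(
                        "Error occurred while obtaining InputStream for blobId [" + item + "]");
            }
            batch.add(item);
        }
        indexed.addAll(batch);   // commit only after every item was readable
        return items.size();     // checkpoint moves to the end of the list
    }

    /** Re-runs the job the way a scheduler would; returns the final checkpoint. */
    static int runRepeatedly(List<String> items, Set<String> missingBlobs,
                             List<String> indexed, int runs) {
        int checkpoint = 0;
        for (int run = 0; run < runs; run++) {
            try {
                checkpoint = runOnce(items, checkpoint, missingBlobs, indexed);
            } catch (RuntimeException e) {
                // The scheduler only logs the failure; the checkpoint stays
                // where it was, so the next run hits the same blob again.
            }
        }
        return checkpoint;
    }
}
```

In this model, with items `a`, `b`, `c` and `b` missing, the checkpoint is still 0 after any number of runs and nothing is indexed, including the readable items `a` and `c`.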

      17.12.2015 20:50:26.418 -0500 *ERROR* [pool-7-thread-5] org.apache.sling.commons.scheduler.impl.QuartzScheduler Exception during job execution of org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate@5cc5e2f6 : Error occurred while obtaining InputStream for blobId [2832539c16b1a2e5745370ee89e41ab562436c5f#109419]
      java.lang.RuntimeException: Error occurred while obtaining InputStream for blobId [2832539c16b1a2e5745370ee89e41ab562436c5f#109419]
      	at org.apache.jackrabbit.oak.plugins.blob.BlobStoreBlob.getNewStream(BlobStoreBlob.java:49)
      	at org.apache.jackrabbit.oak.plugins.segment.SegmentBlob.getNewStream(SegmentBlob.java:84)
      	at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexFile.loadBlob(OakDirectory.java:216)
      	at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexFile.readBytes(OakDirectory.java:264)
      	at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexInput.readBytes(OakDirectory.java:350)
      	at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexInput.readByte(OakDirectory.java:356)
      	at org.apache.lucene.store.DataInput.readInt(DataInput.java:84)
      	at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:126)
      	at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader.<init>(Lucene41PostingsReader.java:75)
      	at org.apache.lucene.codecs.lucene41.Lucene41PostingsFormat.fieldsProducer(Lucene41PostingsFormat.java:430)
      	at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.<init>(PerFieldPostingsFormat.java:195)
      	at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat.fieldsProducer(PerFieldPostingsFormat.java:244)
      	at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:116)
      	at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:96)
      	at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:141)
      	at org.apache.lucene.index.BufferedUpdatesStream.applyDeletesAndUpdates(BufferedUpdatesStream.java:279)
      	at org.apache.lucene.index.IndexWriter.applyAllDeletesAndUpdates(IndexWriter.java:3191)
      	at org.apache.lucene.index.IndexWriter.maybeApplyDeletes(IndexWriter.java:3182)
      	at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3155)
      	at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3123)
      	at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:988)
      	at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:932)
      	at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:894)
      	at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorContext.closeWriter(LuceneIndexEditorContext.java:169)
      	at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor.leave(LuceneIndexEditor.java:190)
      	at org.apache.jackrabbit.oak.plugins.index.IndexUpdate.leave(IndexUpdate.java:221)
      	at org.apache.jackrabbit.oak.spi.commit.VisibleEditor.leave(VisibleEditor.java:63)
      	at org.apache.jackrabbit.oak.spi.commit.EditorDiff.process(EditorDiff.java:56)
      	at org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate.updateIndex(AsyncIndexUpdate.java:367)
      	at org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate.run(AsyncIndexUpdate.java:312)
      	at org.apache.sling.commons.scheduler.impl.QuartzJobExecutor.execute(QuartzJobExecutor.java:105)
      	at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:745)
      Caused by: java.io.IOException: org.apache.jackrabbit.core.data.DataStoreException: Could not length of dataIdentifier 2832539c16b1a2e5745370ee89e41ab562436c5f
      	at org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore.getStream(DataStoreBlobStore.java:465)
      	at org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore.getInputStream(DataStoreBlobStore.java:297)
      	at org.apache.jackrabbit.oak.plugins.blob.BlobStoreBlob.getNewStream(BlobStoreBlob.java:47)
      	... 34 common frames omitted
      Caused by: org.apache.jackrabbit.core.data.DataStoreException: Could not length of dataIdentifier 2832539c16b1a2e5745370ee89e41ab562436c5f
      	at org.apache.jackrabbit.aws.ext.ds.S3Backend.getLength(S3Backend.java:474)
      	at org.apache.jackrabbit.core.data.CachingDataStore.getLength(CachingDataStore.java:669)
      	at org.apache.jackrabbit.core.data.CachingDataStore.getRecord(CachingDataStore.java:467)
      	at org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore.getDataRecord(DataStoreBlobStore.java:474)
      	at org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore.getStream(DataStoreBlobStore.java:463)
      	... 36 common frames omitted
      Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: E29ADB7F4BE7E12F)
      	at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1078)
      	at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:726)
      	at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:461)
      	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:296)
      	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3736)
      	at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1027)
      	at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1005)
      	at org.apache.jackrabbit.aws.ext.ds.S3Backend.getLength(S3Backend.java:467)
      	... 40 common frames omitted
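
The trace above nests four exceptions: a RuntimeException wraps an IOException, which wraps a DataStoreException, which wraps the AmazonS3Exception carrying the 404. As a rough sketch of how a more tolerant caller could tell a missing blob apart from other failures, the following plain-Java helper walks the cause chain. The class `RootCauseDemo`, both method names, and the string-matching heuristic are assumptions for illustration only; this is not how Oak or the AWS SDK classify errors.

```java
// Hypothetical helper (not part of Oak or the AWS SDK): inspects the
// innermost cause of a wrapped exception.
final class RootCauseDemo {

    /** Follows getCause() links to the innermost exception. */
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        int depth = 0;
        // The depth bound guards against a malformed, cyclic cause chain.
        while (current.getCause() != null && depth < 100) {
            current = current.getCause();
            depth++;
        }
        return current;
    }

    /**
     * Assumed heuristic: treat a root-cause message that reports HTTP 404
     * (as in the S3 message above) as "blob is missing".
     */
    static boolean looksLikeMissingBlob(Throwable t) {
        String message = rootCause(t).getMessage();
        return message != null && message.contains("Status Code: 404");
    }
}
```

A caller could use such a check to log and skip a missing binary while still failing the run for other errors. Matching on message text is brittle, so checking the exception type or status code directly would be preferable where those classes are available.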
      


            People

              Assignee: Chetan Mehrotra (chetanm)
              Reporter: Alexander Klimetschek (alexander.klimetschek)
