Jackrabbit Oak / OAK-3813

Exception in datastore leads to async index stop indexing new content


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Duplicate
    • Affects Version/s: 1.2.2
    • Fix Version/s: None
    • Component/s: lucene
    • Labels: None

      Description

      We are using an S3-based datastore which (for unrelated reasons) sometimes starts to miss certain blobs and throws an exception, see below. Unfortunately, this blocks the indexing of any new content, as the async index will try again and again to index the missing binary and fail at the same point.

      It would be great if the indexing process could be more resilient against errors like this. (I think the datastore implementation should probably not propagate that exception to the outside but just log it, but that's a separate issue.)
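      The resilience suggested above could take the shape of a log-and-skip pattern. The sketch below is hypothetical, not Oak's actual API: `BlobLoader` and `indexAll` are stand-ins invented for illustration, showing how a single unreadable binary could be recorded and skipped instead of aborting the whole indexing cycle.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

/**
 * Minimal sketch (NOT Oak's real indexing code) of the suggested
 * resilience: when one binary cannot be loaded, log/record it and
 * continue, instead of failing the entire async index run.
 */
public class ResilientIndexSketch {

    /** Hypothetical stand-in for a blob lookup that may fail. */
    interface BlobLoader {
        InputStream load(String blobId) throws IOException;
    }

    /**
     * Indexes each blob id; ids whose stream cannot be obtained are
     * appended to {@code skipped} and the loop continues.
     */
    static List<String> indexAll(List<String> blobIds, BlobLoader loader,
                                 List<String> skipped) {
        List<String> indexed = new ArrayList<>();
        for (String id : blobIds) {
            try (InputStream in = loader.load(id)) {
                // real code would feed the stream to Lucene here
                indexed.add(id);
            } catch (IOException e) {
                // log-and-continue instead of aborting the cycle
                skipped.add(id);
            }
        }
        return indexed;
    }
}
```

      With this shape, a missing S3 object would cost one document in the index rather than stalling indexing of all subsequent content.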

      This is seen with Oak 1.2.2. I had a look at the latest version on trunk, but it seems the behavior has not changed since then.
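      The stalling behavior can be pictured with a toy model (again not Oak code; `runCycles` and its checkpoint variable are invented for illustration): each scheduled run replays the changes from the last successful checkpoint, so when the same blob fails every time, the checkpoint never advances and content behind it is never indexed.

```java
import java.util.List;

/**
 * Toy model (NOT Oak's AsyncIndexUpdate) of why one bad blob stalls
 * async indexing: each cycle retries from the last successful
 * checkpoint, hits the same failure, and never advances past it.
 */
public class StalledCheckpointSketch {

    /** Returns the checkpoint position after {@code cycles} runs. */
    static int runCycles(List<String> changes, String badBlobId, int cycles) {
        int checkpoint = 0;  // index of the first unprocessed change
        for (int run = 0; run < cycles; run++) {
            int i = checkpoint;
            try {
                for (; i < changes.size(); i++) {
                    if (changes.get(i).equals(badBlobId)) {
                        // mirrors the RuntimeException in the trace below
                        throw new RuntimeException(
                                "Error obtaining InputStream for blobId "
                                        + badBlobId);
                    }
                }
                checkpoint = i;  // whole batch succeeded: advance
            } catch (RuntimeException e) {
                // cycle aborted: checkpoint unchanged, same failure next run
            }
        }
        return checkpoint;
    }
}
```

      However many cycles the scheduler fires, the checkpoint stays put as long as the missing blob is in the pending batch, which matches the repeated identical errors seen in the log.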

      17.12.2015 20:50:26.418 -0500 *ERROR* [pool-7-thread-5] org.apache.sling.commons.scheduler.impl.QuartzScheduler Exception during job execution of org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate@5cc5e2f6 : Error occurred while obtaining InputStream for blobId [2832539c16b1a2e5745370ee89e41ab562436c5f#109419]
      java.lang.RuntimeException: Error occurred while obtaining InputStream for blobId [2832539c16b1a2e5745370ee89e41ab562436c5f#109419]
      	at org.apache.jackrabbit.oak.plugins.blob.BlobStoreBlob.getNewStream(BlobStoreBlob.java:49)
      	at org.apache.jackrabbit.oak.plugins.segment.SegmentBlob.getNewStream(SegmentBlob.java:84)
      	at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexFile.loadBlob(OakDirectory.java:216)
      	at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexFile.readBytes(OakDirectory.java:264)
      	at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexInput.readBytes(OakDirectory.java:350)
      	at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexInput.readByte(OakDirectory.java:356)
      	at org.apache.lucene.store.DataInput.readInt(DataInput.java:84)
      	at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:126)
      	at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader.<init>(Lucene41PostingsReader.java:75)
      	at org.apache.lucene.codecs.lucene41.Lucene41PostingsFormat.fieldsProducer(Lucene41PostingsFormat.java:430)
      	at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.<init>(PerFieldPostingsFormat.java:195)
      	at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat.fieldsProducer(PerFieldPostingsFormat.java:244)
      	at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:116)
      	at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:96)
      	at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:141)
      	at org.apache.lucene.index.BufferedUpdatesStream.applyDeletesAndUpdates(BufferedUpdatesStream.java:279)
      	at org.apache.lucene.index.IndexWriter.applyAllDeletesAndUpdates(IndexWriter.java:3191)
      	at org.apache.lucene.index.IndexWriter.maybeApplyDeletes(IndexWriter.java:3182)
      	at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3155)
      	at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3123)
      	at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:988)
      	at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:932)
      	at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:894)
      	at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorContext.closeWriter(LuceneIndexEditorContext.java:169)
      	at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor.leave(LuceneIndexEditor.java:190)
      	at org.apache.jackrabbit.oak.plugins.index.IndexUpdate.leave(IndexUpdate.java:221)
      	at org.apache.jackrabbit.oak.spi.commit.VisibleEditor.leave(VisibleEditor.java:63)
      	at org.apache.jackrabbit.oak.spi.commit.EditorDiff.process(EditorDiff.java:56)
      	at org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate.updateIndex(AsyncIndexUpdate.java:367)
      	at org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate.run(AsyncIndexUpdate.java:312)
      	at org.apache.sling.commons.scheduler.impl.QuartzJobExecutor.execute(QuartzJobExecutor.java:105)
      	at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:745)
      Caused by: java.io.IOException: org.apache.jackrabbit.core.data.DataStoreException: Could not length of dataIdentifier 2832539c16b1a2e5745370ee89e41ab562436c5f
      	at org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore.getStream(DataStoreBlobStore.java:465)
      	at org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore.getInputStream(DataStoreBlobStore.java:297)
      	at org.apache.jackrabbit.oak.plugins.blob.BlobStoreBlob.getNewStream(BlobStoreBlob.java:47)
      	... 34 common frames omitted
      Caused by: org.apache.jackrabbit.core.data.DataStoreException: Could not length of dataIdentifier 2832539c16b1a2e5745370ee89e41ab562436c5f
      	at org.apache.jackrabbit.aws.ext.ds.S3Backend.getLength(S3Backend.java:474)
      	at org.apache.jackrabbit.core.data.CachingDataStore.getLength(CachingDataStore.java:669)
      	at org.apache.jackrabbit.core.data.CachingDataStore.getRecord(CachingDataStore.java:467)
      	at org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore.getDataRecord(DataStoreBlobStore.java:474)
      	at org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore.getStream(DataStoreBlobStore.java:463)
      	... 36 common frames omitted
      Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: E29ADB7F4BE7E12F)
      	at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1078)
      	at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:726)
      	at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:461)
      	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:296)
      	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3736)
      	at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1027)
      	at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1005)
      	at org.apache.jackrabbit.aws.ext.ds.S3Backend.getLength(S3Backend.java:467)
      	... 40 common frames omitted
      


              People

              • Assignee: Chetan Mehrotra (chetanm)
              • Reporter: Alexander Klimetschek (alexander.klimetschek)
              • Votes: 0
              • Watchers: 6
