Uploaded image for project: 'Jackrabbit Oak'
  1. Jackrabbit Oak
  2. OAK-2934

Certain searches cause lucene index to hit OutOfMemoryError

    XMLWordPrintableJSON

    Details

      Description

      Certain search terms can get split into very small wildcard tokens that will match a huge amount of items from the index, finally resulting in a OOME.

      For example

      /jcr:root//*[jcr:contains(., 'U=1*')]
      

      will translate into the following lucene query

      :fulltext:"u ( [set of all index terms stating with '1'] )"
      

      this will break down when lucene will try to compute the score for the huge set of tokens:

      java.lang.OutOfMemoryError: Java heap space
              at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexFile.<init>(OakDirectory.java:201)
              at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexFile.<init>(OakDirectory.java:155)
              at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexInput.<init>(OakDirectory.java:340)
              at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexInput.clone(OakDirectory.java:345)
              at org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexInput.clone(OakDirectory.java:329)
              at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$BlockDocsAndPositionsEnum.<init>(Lucene41PostingsReader.java:613)
              at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader.docsAndPositions(Lucene41PostingsReader.java:252)
              at org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.docsAndPositions(BlockTreeTermsReader.java:2233)
              at org.apache.lucene.search.UnionDocsAndPositionsEnum.<init>(MultiPhraseQuery.java:492)
              at org.apache.lucene.search.MultiPhraseQuery$MultiPhraseWeight.scorer(MultiPhraseQuery.java:205)
              at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:618)
              at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:491)
              at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:448)
              at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:281)
              at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:269)
              at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndex$1.loadDocs(LuceneIndex.java:352)
              at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndex$1.computeNext(LuceneIndex.java:289)
              at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndex$1.computeNext(LuceneIndex.java:280)
              at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
              at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
              at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndex$LucenePathCursor$1.hasNext(LuceneIndex.java:1026)
              at com.google.common.collect.Iterators$7.computeNext(Iterators.java:645)
              at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
              at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
              at org.apache.jackrabbit.oak.spi.query.Cursors$PathCursor.hasNext(Cursors.java:198)
              at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndex$LucenePathCursor.hasNext(LuceneIndex.java:1047)
              at org.apache.jackrabbit.oak.plugins.index.aggregate.AggregationCursor.fetchNext(AggregationCursor.java:88)
              at org.apache.jackrabbit.oak.plugins.index.aggregate.AggregationCursor.hasNext(AggregationCursor.java:75)
              at org.apache.jackrabbit.oak.spi.query.Cursors$ConcatCursor.fetchNext(Cursors.java:474)
              at org.apache.jackrabbit.oak.spi.query.Cursors$ConcatCursor.hasNext(Cursors.java:466)
              at org.apache.jackrabbit.oak.spi.query.Cursors$ConcatCursor.fetchNext(Cursors.java:474)
              at org.apache.jackrabbit.oak.spi.query.Cursors$ConcatCursor.hasNext(Cursors.java:466)
      

        Attachments

        1. LuceneIndex.java.patch
          6 kB
          Alex Deparvu

          Activity

            People

            • Assignee:
              stillalex Alex Deparvu
              Reporter:
              stillalex Alex Deparvu
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: