Uploaded image for project: 'Jackrabbit Oak'
  1. Jackrabbit Oak
  2. OAK-9184

Very slow, potential endless loop in LucenePropertyIndex.loadDocs()

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 1.22.4, 1.32.0, 1.8.23
    • 1.34.0, 1.22.5, 1.8.24
    • lucene
    • None

    Description

      We found one case of a very slow, possibly endless loop in this method. 

      java.lang.Thread.State: RUNNABLE at org.apache.lucene.search.TopScoreDocCollector$InOrderPagingScoreDocCollector.collect(TopScoreDocCollector.java:85) at org.apache.lucene.search.Scorer.score(Scorer.java:65) at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621) at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:491) at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:448) at org.apache.lucene.search.IndexSearcher.searchAfter(IndexSearcher.java:243) at org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$1.loadDocs(LucenePropertyIndex.java:434) at org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$1.computeNext(LucenePropertyIndex.java:353)
      

      Background: Many threads are waiting to lock here:

      	- waiting to lock <0x00007fbd804a9448> (a org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker)
      	at org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker.acquireIndexNode(IndexTracker.java:202)
      
          public IndexNode acquireIndexNode(String path) {
              IndexNodeManager index = indices.get(path);
              IndexNode indexNode = index != null ? index.acquire() : null;
              if (indexNode != null) {
                  return indexNode;
              } else {
                  return findIndexNode(path); <<= synchronized method
              }
          }
      

      And the thread that is holding that (synchronized) lock on IndexTracker is also waiting to get a write lock here:

      "oak-lucene-992" #21559 daemon prio=1 os_prio=0 tid=0x00007ed0280ad000 nid=0x1be8 waiting on condition [0x00007fb96c352000]
         java.lang.Thread.State: WAITING (parking)
      	at sun.misc.Unsafe.park(Native Method)
      	- parking to wait for  <0x00007fbedad77210> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
      	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
      	at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
      	at org.apache.jackrabbit.oak.plugins.index.lucene.IndexNodeManager.close(IndexNodeManager.java:174)
      	at org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker.close(IndexTracker.java:104)
      	- locked <0x00007fbd804a9448> (a org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker)
      	at org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker.update(IndexTracker.java:114)
      	- locked <0x00007fbd804a9448> (a org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker)
      
        void close() throws IOException {
              lock.writeLock().lock(); <<== waiting here
              try {
                  checkState(!closed);
                  closed = true;
              } finally {
                  lock.writeLock().unlock();
              }
      

      It looks like a read lock is not released. The places where a read lock is acquired:

          IndexNode acquire() {
              lock.readLock().lock();
        (released if the method returns null or throws an exception)
      
          private void release() {
              lock.readLock().unlock();
          }
      

      acquire() is called in acquireIndexNode:

          public IndexNode acquireIndexNode(String path) {
              IndexNodeManager index = indices.get(path);
              IndexNode indexNode = index != null ? index.acquire() : null; <<== here
              if (indexNode != null) {
                  return indexNode;
              } else {
                  return findIndexNode(path);
              }
          }
      
          private synchronized IndexNode findIndexNode(String path) {
              // Retry the lookup from acquireIndexNode now that we're
              // synchronized. The acquire() call is guaranteed to succeed
              // since the close() method is also synchronized.
              IndexNodeManager index = indices.get(path);
              if (index != null) {
                  IndexNode indexNode = index.acquire(); <<== here
                  return checkNotNull(indexNode);
              }
      
      

      This method is called in multiple places (getPlanDescription, getPlans, loadDocs, getSize, getOldFullTextIndexPath, dumpIndexContent, getFieldInfo, getFieldTermPrefixInfo, getIndexedPaths, getIndexStats) but always released in "finally".

      There is one thread that seems to hold this read lock:

      "10.13.0.73 [1597750655085] GET /mnt/overlay/granite/ui/content/shell/omnisearch/searchresults.html HTTP/1.1" #10953 prio=5 os_prio=0 tid=0x00007fb97000f000 nid=0x8b59 runnable [0x00007fb96c44c000]
         java.lang.Thread.State: RUNNABLE
      	at org.apache.lucene.search.TopScoreDocCollector$InOrderPagingScoreDocCollector.collect(TopScoreDocCollector.java:85)
      	at org.apache.lucene.search.Scorer.score(Scorer.java:65)
      	at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621)
      	at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:491)
      	at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:448)
      	at org.apache.lucene.search.IndexSearcher.searchAfter(IndexSearcher.java:243)
      	at org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$1.loadDocs(LucenePropertyIndex.java:434)
      	at org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$1.computeNext(LucenePropertyIndex.java:353)
      	at org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$1.computeNext(LucenePropertyIndex.java:339)
      	at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
      	at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
      	at org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$LucenePathCursor$1.hasNext(LucenePropertyIndex.java:1718)
      	at com.google.common.collect.Iterators$7.computeNext(Iterators.java:645)
      	at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
      	at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
      	at org.apache.jackrabbit.oak.plugins.index.Cursors$PathCursor.hasNext(Cursors.java:216)
      	at org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$LucenePathCursor.hasNext(LucenePropertyIndex.java:1751)
      	at org.apache.jackrabbit.oak.query.ast.SelectorImpl.next(SelectorImpl.java:432)
      	at org.apache.jackrabbit.oak.query.QueryImpl$RowIterator.fetchNext(QueryImpl.java:824)
      	at org.apache.jackrabbit.oak.query.QueryImpl$RowIterator.hasNext(QueryImpl.java:851)
      	at com.google.common.collect.Iterators$5.hasNext(Iterators.java:542)
      	at org.apache.jackrabbit.oak.query.FilterIterators$DistinctIterator.fetchNext(FilterIterators.java:141)
      	at org.apache.jackrabbit.oak.query.FilterIterators$DistinctIterator.hasNext(FilterIterators.java:155)
      	at org.apache.jackrabbit.oak.jcr.query.QueryResultImpl$1.fetch(QueryResultImpl.java:98)
      	at org.apache.jackrabbit.oak.jcr.query.QueryResultImpl$1.next(QueryResultImpl.java:123)
      	at org.apache.jackrabbit.oak.jcr.query.QueryResultImpl$1.next(QueryResultImpl.java:78)
      	at org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate$SynchronizedIterator.next(SessionDelegate.java:702)
      	at org.apache.jackrabbit.oak.jcr.query.PrefetchIterator.size(PrefetchIterator.java:135)
      	at org.apache.jackrabbit.oak.jcr.query.QueryResultImpl$3.getSize(QueryResultImpl.java:143)
      	at org.apache.jackrabbit.oak.jcr.query.QueryResultImpl.getRows(QueryResultImpl.java:140)
      

      Attachments

        Activity

          People

            thomasm Thomas Mueller
            thomasm Thomas Mueller
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: