ManifoldCF / CONNECTORS-1093

ManifoldCF document reprioritization bottleneck

Details

    Description

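      Starting a job with 200K+ documents now takes many minutes. The cause appears to be document reprioritization at startup: in the thread dump below, StartupThread is inside resetAllDocumentPriorities, waiting on a database update issued from BinManager.getIncrementBinValues while a document priority is being computed: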

      	at org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.finishUp(Database.java:694)
      	at org.apache.manifoldcf.core.database.Database.executeViaThread(Database.java:728)
      	at org.apache.manifoldcf.core.database.Database.executeUncachedQuery(Database.java:762)
      	at org.apache.manifoldcf.core.database.Database$QueryCacheExecutor.create(Database.java:1435)
      	at org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:146)
      	at org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:191)
      	at org.apache.manifoldcf.core.database.DBInterfaceHSQLDB.performModification(DBInterfaceHSQLDB.java:750)
      	at org.apache.manifoldcf.core.database.DBInterfaceHSQLDB.performUpdate(DBInterfaceHSQLDB.java:296)
      	at org.apache.manifoldcf.core.database.BaseTable.performUpdate(BaseTable.java:80)
      	at org.apache.manifoldcf.crawler.bins.BinManager.getIncrementBinValues(BinManager.java:158)
      	at org.apache.manifoldcf.crawler.reprioritizationtracker.ReprioritizationTracker.getIncrementBinValue(ReprioritizationTracker.java:328)
      	at org.apache.manifoldcf.crawler.system.PriorityCalculator.getDocumentPriority(PriorityCalculator.java:145)
      	at org.apache.manifoldcf.crawler.jobs.JobQueue.writeDocPriority(JobQueue.java:874)
      	at org.apache.manifoldcf.crawler.jobs.JobManager.writeDocumentPriorities(JobManager.java:2142)
      	at org.apache.manifoldcf.crawler.system.ManifoldCF.writeDocumentPriorities(ManifoldCF.java:1121)
      	at org.apache.manifoldcf.crawler.system.ManifoldCF.resetAllDocumentPriorities(ManifoldCF.java:1054)
      	at org.apache.manifoldcf.crawler.system.StartupThread.run(StartupThread.java:141)
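
      To illustrate why this pattern could be slow at 200K+ documents, here is a minimal sketch that counts simulated database round trips. Everything in it (BinIncrementSketch, CountingStore, addAndGetPrevious, the "host" bin names, the assumption of 10 distinct bins) is hypothetical and is not ManifoldCF's API, and it is not necessarily how this issue was resolved. It assumes, as the trace suggests, that each priority calculation issues its own update, and compares that with accumulating increments in memory and writing once per bin.

```java
import java.util.HashMap;
import java.util.Map;

public class BinIncrementSketch {
    /** Hypothetical stand-in for a database-backed counter table; it counts round trips. */
    static class CountingStore {
        final Map<String, Long> values = new HashMap<>();
        int roundTrips = 0;

        /** Adds delta to the named bin in one simulated round trip; returns the value before the add. */
        long addAndGetPrevious(String bin, long delta) {
            roundTrips++;
            long previous = values.getOrDefault(bin, 0L);
            values.put(bin, previous + delta);
            return previous;
        }
    }

    /** One store update per document: the pattern the thread dump suggests. */
    static int perDocument(CountingStore store, String[] docBins) {
        for (String bin : docBins) {
            store.addAndGetPrevious(bin, 1L);
        }
        return store.roundTrips;
    }

    /** Count documents per bin in memory first, then issue one update per distinct bin. */
    static int batched(CountingStore store, String[] docBins) {
        Map<String, Long> pending = new HashMap<>();
        for (String bin : docBins) {
            pending.merge(bin, 1L, Long::sum);
        }
        for (Map.Entry<String, Long> e : pending.entrySet()) {
            store.addAndGetPrevious(e.getKey(), e.getValue());
        }
        return store.roundTrips;
    }

    public static void main(String[] args) {
        String[] docBins = new String[200_000];
        for (int i = 0; i < docBins.length; i++) {
            docBins[i] = "host" + (i % 10); // assumption: documents fall into 10 distinct bins
        }
        System.out.println("per-document round trips: " + perDocument(new CountingStore(), docBins));
        System.out.println("batched round trips: " + batched(new CountingStore(), docBins));
    }
}
```

      In this sketch the batched variant still returns the value before the add, so a caller could hand each document in the batch a distinct value by adding an in-memory offset to it. Whether that preserves the ordering the real priority calculation needs is not something the trace alone can confirm.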
      

          People

            kwright@metacarta.com Karl Wright