ManifoldCF / CONNECTORS-1094

Slow reprioritization impedes startup

Details

    Description

      With the latest revisions, documents for all jobs (legacy and new) do get
      picked up and processed, which is great! This was verified on a small
      1-node test system.
      I have since applied the fix to a much larger environment (29M docs across
      4 MCF agents using a 3-node ZooKeeper cluster), which has a number of
      mid-sized jobs (hundreds of thousands of docs each) in a Running state. The
      update of the priorityset field for ~36M jobqueue records took just over an
      hour. More problematic for me is the rate of reprioritization on startup,
      which was very slow: nearly 2 hours to update ~600,000 records.
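
To put the two figures side by side, here is a small back-of-the-envelope calculation. It rounds the quoted durations to 60 and 120 minutes, so the results are approximate; the class and method names are made up for this illustration.

```java
public class ReprioritizationRate {
    // Records processed per second, given a record count and elapsed minutes.
    static double recordsPerSecond(long records, double minutes) {
        return records / (minutes * 60.0);
    }

    public static void main(String[] args) {
        // Figures quoted above, rounded: ~36M records in about 60 minutes,
        // and ~600,000 records in about 120 minutes.
        double bulkUpdate = recordsPerSecond(36_000_000L, 60.0);
        double reprioritization = recordsPerSecond(600_000L, 120.0);

        System.out.printf("priorityset update: ~%.0f records/s%n", bulkUpdate);
        System.out.printf("reprioritization:   ~%.0f records/s%n", reprioritization);
        System.out.printf("ratio:              ~%.0fx%n", bulkUpdate / reprioritization);
    }
}
```

With these rounded inputs the priorityset update runs at roughly 10,000 records/s and the startup reprioritization at roughly 83 records/s, a gap of about two orders of magnitude.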

      A couple of SQL queries
      (JobManager#getNextNotYetProcessedReprioritizationDocuments and
      ManifoldCF#writeDocumentPriorities) come up frequently, but a VisualVM
      profile of the MCF agent shows that the majority of the Agents thread's
      time is spent talking to ZK, for locking and for reading some config data
      very frequently (see the snapshots below).

      Is it possible to avoid the per-document locking pattern seen in this case?

      "Agents thread" - Thread t@21
         java.lang.Thread.State: WAITING
          at java.lang.Object.wait(Native Method)
          - waiting on <487ef1bb> (a org.apache.zookeeper.ClientCnxn$Packet)
          at java.lang.Object.wait(Object.java:503)
          at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1309)
          at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1149)
          at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1180)
          at org.apache.manifoldcf.core.lockmanager.ZooKeeperConnection.readData(ZooKeeperConnection.java:819)
          at org.apache.manifoldcf.core.lockmanager.ZooKeeperLockManager.getSharedConfiguration(ZooKeeperLockManager.java:670)
          at org.apache.manifoldcf.core.interfaces.LockManagerFactory.getBooleanProperty(LockManagerFactory.java:110)
          at org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.setThreadContext(SharedDriveConnector.java:157)
          at org.apache.manifoldcf.core.connectorpool.ConnectorPool$Pool.getConnector(ConnectorPool.java:489)
          - locked <3f2843d4> (a org.apache.manifoldcf.core.connectorpool.ConnectorPool$Pool)
          at org.apache.manifoldcf.core.connectorpool.ConnectorPool.grab(ConnectorPool.java:255)
          at org.apache.manifoldcf.crawler.repositoryconnectorpool.RepositoryConnectorPool.grab(RepositoryConnectorPool.java:86)
          at org.apache.manifoldcf.crawler.system.ManifoldCF.writeDocumentPriorities(ManifoldCF.java:1007)
          at org.apache.manifoldcf.crawler.system.ManifoldCF.resetAllDocumentPriorities(ManifoldCF.java:960)
          at org.apache.manifoldcf.crawler.system.CrawlerAgent.cleanUpAllAgentData(CrawlerAgent.java:155)
          at org.apache.manifoldcf.agents.system.AgentsDaemon$CleanupAgent.cleanUpAllServices(AgentsDaemon.java:356)
          at org.apache.manifoldcf.core.lockmanager.ZooKeeperLockManager.registerServiceBeginServiceActivity(ZooKeeperLockManager.java:203)
      
      "Agents thread" - Thread t@21
         java.lang.Thread.State: WAITING
          at java.lang.Object.wait(Native Method)
          - waiting on <52698c72> (a org.apache.zookeeper.ClientCnxn$Packet)
          at java.lang.Object.wait(Object.java:503)
          at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1309)
          at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:781)
          at org.apache.manifoldcf.core.lockmanager.ZooKeeperConnection.createSequentialChild(ZooKeeperConnection.java:1116)
          at org.apache.manifoldcf.core.lockmanager.ZooKeeperConnection.obtainReadLock(ZooKeeperConnection.java:691)
          at org.apache.manifoldcf.core.lockmanager.ZooKeeperLockObject.obtainGlobalReadLock(ZooKeeperLockObject.java:193)
          at org.apache.manifoldcf.core.lockmanager.LockObject.enterReadLock(LockObject.java:310)
          - locked <151db932> (a org.apache.manifoldcf.core.lockmanager.ZooKeeperLockObject)
          at org.apache.manifoldcf.core.lockmanager.LockGate.enterReadLock(LockGate.java:261)
          at org.apache.manifoldcf.core.lockmanager.BaseLockManager.enterRead(BaseLockManager.java:1283)
          at org.apache.manifoldcf.core.lockmanager.BaseLockManager.enterReadLock(BaseLockManager.java:790)
          at org.apache.manifoldcf.crawler.reprioritizationtracker.ReprioritizationTracker.getMinimumDepth(ReprioritizationTracker.java:251)
          at org.apache.manifoldcf.crawler.system.PriorityCalculator.<init>(PriorityCalculator.java:89)
          at org.apache.manifoldcf.crawler.system.ManifoldCF.writeDocumentPriorities(ManifoldCF.java:1021)
          at org.apache.manifoldcf.crawler.system.ManifoldCF.resetAllDocumentPriorities(ManifoldCF.java:960)
          at org.apache.manifoldcf.crawler.system.CrawlerAgent.cleanUpAllAgentData(CrawlerAgent.java:155)
      
      "Agents thread" - Thread t@21
         java.lang.Thread.State: WAITING
          at java.lang.Object.wait(Native Method)
          - waiting on <79c64d6d> (a org.apache.zookeeper.ClientCnxn$Packet)
          at java.lang.Object.wait(Object.java:503)
          at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1309)
          at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:871)
          at org.apache.manifoldcf.core.lockmanager.ZooKeeperConnection.releaseLock(ZooKeeperConnection.java:796)
          at org.apache.manifoldcf.core.lockmanager.ZooKeeperLockObject.clearLock(ZooKeeperLockObject.java:218)
          at org.apache.manifoldcf.core.lockmanager.ZooKeeperLockObject.clearGlobalReadLockNoWait(ZooKeeperLockObject.java:212)
          at org.apache.manifoldcf.core.lockmanager.LockObject.clearGlobalReadLock(LockObject.java:395)
          at org.apache.manifoldcf.core.lockmanager.LockObject.leaveReadLock(LockObject.java:376)
          - locked <126e1776> (a org.apache.manifoldcf.core.lockmanager.ZooKeeperLockObject)
          at org.apache.manifoldcf.core.lockmanager.LockGate.leaveReadLock(LockGate.java:289)
          - locked <126e1776> (a org.apache.manifoldcf.core.lockmanager.ZooKeeperLockObject)
          at org.apache.manifoldcf.core.lockmanager.BaseLockManager.leaveRead(BaseLockManager.java:1369)
          at org.apache.manifoldcf.core.lockmanager.BaseLockManager.leaveReadLock(BaseLockManager.java:804)
          at org.apache.manifoldcf.crawler.reprioritizationtracker.ReprioritizationTracker.getMinimumDepth(ReprioritizationTracker.java:258)
          at org.apache.manifoldcf.crawler.system.PriorityCalculator.<init>(PriorityCalculator.java:89)
          at org.apache.manifoldcf.crawler.system.ManifoldCF.writeDocumentPriorities(ManifoldCF.java:1021)
          at org.apache.manifoldcf.crawler.system.ManifoldCF.resetAllDocumentPriorities(ManifoldCF.java:960)
          at org.apache.manifoldcf.crawler.system.CrawlerAgent.cleanUpAllAgentData(CrawlerAgent.java:155)
          at org.apache.manifoldcf.agents.system.AgentsDaemon$CleanupAgent.cleanUpAllServices(AgentsDaemon.java:356)
      
      "Agents thread" - Thread t@21
         java.lang.Thread.State: WAITING
          at java.lang.Object.wait(Native Method)
          - waiting on <354dbdf> (a org.apache.zookeeper.ClientCnxn$Packet)
          at java.lang.Object.wait(Object.java:503)
          at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1309)
          at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1149)
          at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1180)
          at org.apache.manifoldcf.core.lockmanager.ZooKeeperConnection.readData(ZooKeeperConnection.java:819)
          at org.apache.manifoldcf.core.lockmanager.ZooKeeperLockManager.getSharedConfiguration(ZooKeeperLockManager.java:670)
          at org.apache.manifoldcf.core.interfaces.LockManagerFactory.getBooleanProperty(LockManagerFactory.java:110)
          at org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.setThreadContext(SharedDriveConnector.java:157)
          at org.apache.manifoldcf.core.connectorpool.ConnectorPool$Pool.getConnector(ConnectorPool.java:489)
          - locked <6f2f3168> (a org.apache.manifoldcf.core.connectorpool.ConnectorPool$Pool)
          at org.apache.manifoldcf.core.connectorpool.ConnectorPool.grab(ConnectorPool.java:255)
          at org.apache.manifoldcf.crawler.repositoryconnectorpool.RepositoryConnectorPool.grab(RepositoryConnectorPool.java:86)
          at org.apache.manifoldcf.crawler.system.ManifoldCF.writeDocumentPriorities(ManifoldCF.java:1007)
          at org.apache.manifoldcf.crawler.system.ManifoldCF.resetAllDocumentPriorities(ManifoldCF.java:960)
          at org.apache.manifoldcf.crawler.system.CrawlerAgent.cleanUpAllAgentData(CrawlerAgent.java:155)
          at
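
In the traces above, SharedDriveConnector.setThreadContext reaches ZooKeeper.getData through LockManagerFactory.getBooleanProperty each time a connector is grabbed. One possible way to avoid that kind of repeated round trip is to cache the value locally and refresh it only after a time-to-live expires. The sketch below illustrates the idea in isolation: CachedBoolean and countRemoteReads are hypothetical names, the "remote" read is a local stand-in rather than a real ZooKeeper call, and none of this is ManifoldCF API or the content of the attached patch.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

public class CachedPropertyDemo {
    /** Wraps an expensive boolean lookup and re-reads it only after ttlMillis. */
    static final class CachedBoolean implements BooleanSupplier {
        private final BooleanSupplier source;
        private final long ttlMillis;
        private boolean value;
        private long loadedAt;
        private boolean loaded;

        CachedBoolean(BooleanSupplier source, long ttlMillis) {
            this.source = source;
            this.ttlMillis = ttlMillis;
        }

        @Override
        public synchronized boolean getAsBoolean() {
            long now = System.currentTimeMillis();
            if (!loaded || now - loadedAt >= ttlMillis) {
                value = source.getAsBoolean(); // the only call that reaches the source
                loadedAt = now;
                loaded = true;
            }
            return value;
        }
    }

    /** Performs one lookup per document and returns how many reached the source. */
    static int countRemoteReads(int documents, boolean cached) {
        AtomicInteger remoteReads = new AtomicInteger();
        // Hypothetical stand-in for a ZooKeeper-backed property read.
        BooleanSupplier remote = () -> {
            remoteReads.incrementAndGet();
            return true;
        };
        BooleanSupplier lookup = cached ? new CachedBoolean(remote, 60_000L) : remote;
        for (int i = 0; i < documents; i++) {
            lookup.getAsBoolean();
        }
        return remoteReads.get();
    }

    public static void main(String[] args) {
        System.out.println("uncached remote reads: " + countRemoteReads(1000, false));
        System.out.println("cached remote reads:   " + countRemoteReads(1000, true));
    }
}
```

For 1000 simulated documents this reports 1000 source reads without the cache and 1 with it. The trade-off is that a cached value can be stale for up to the TTL, so this only suits settings that change rarely. The lock taken in ReprioritizationTracker.getMinimumDepth is a different case: there the analogous change would presumably be to read the value once per batch rather than once per document, which depends on what consistency the tracker needs.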
      

      Attachments

        1. CONNECTORS-1094.patch
          5 kB
          Karl Wright

          People

            kwright@metacarta.com Karl Wright