Jackrabbit FileVault / JCRVLT-374

Assembling a content package consumes a lot of memory

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.2.8
    • Fix Version/s: None
    • Component/s: Packaging
    • Labels:
      None

      Description

      I came across a situation in which packaging a huge subtree (/jcr:system/jcr:versionStorage; a bad idea, I know) caused a huge spike in memory usage, which in turn caused many full GCs (due to allocation failures).

      I have several stacktraces from that time, which all look very similar to this one:

      "qtp1597826410-38130" prio=5 tid=0x94f2 nid=0xffffffff runnable
         java.lang.Thread.State: RUNNABLE
              at org.apache.jackrabbit.oak.segment.SegmentNodeBuilder.createChildBuilder(SegmentNodeBuilder.java:147)
              at org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.getChildNode(MemoryNodeBuilder.java:330)
              at org.apache.jackrabbit.oak.core.SecureNodeBuilder.<init>(SecureNodeBuilder.java:110)
              at org.apache.jackrabbit.oak.core.SecureNodeBuilder.getChildNode(SecureNodeBuilder.java:327)
              at org.apache.jackrabbit.oak.core.MutableTree.getTree(MutableTree.java:288)
              at org.apache.jackrabbit.oak.core.MutableRoot.getTree(MutableRoot.java:220)
              at org.apache.jackrabbit.oak.core.MutableRoot.getTree(MutableRoot.java:69)
              at org.apache.jackrabbit.oak.jcr.session.WorkspaceImpl$1.getTypes(WorkspaceImpl.java:85)
              at org.apache.jackrabbit.oak.plugins.nodetype.ReadOnlyNodeTypeManager.isNodeType(ReadOnlyNodeTypeManager.java:293)
              at org.apache.jackrabbit.oak.jcr.session.NodeImpl$24.perform(NodeImpl.java:931)
              at org.apache.jackrabbit.oak.jcr.session.NodeImpl$24.perform(NodeImpl.java:926)
              at org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.perform(SessionDelegate.java:207)
              at org.apache.jackrabbit.oak.jcr.session.ItemImpl.perform(ItemImpl.java:112)
              at org.apache.jackrabbit.oak.jcr.session.NodeImpl.isNodeType(NodeImpl.java:926)
              at org.apache.jackrabbit.vault.fs.impl.aggregator.FileAggregator.matches(FileAggregator.java:66)
              at org.apache.jackrabbit.vault.fs.impl.AggregatorProvider.getAggregator(AggregatorProvider.java:68)
              at org.apache.jackrabbit.vault.fs.impl.AggregateManagerImpl.getAggregator(AggregateManagerImpl.java:455)
              at org.apache.jackrabbit.vault.fs.impl.AggregateImpl.prepare(AggregateImpl.java:720)
              at org.apache.jackrabbit.vault.fs.impl.AggregateImpl.prepare(AggregateImpl.java:733)
              at org.apache.jackrabbit.vault.fs.impl.AggregateImpl.prepare(AggregateImpl.java:733)
              at org.apache.jackrabbit.vault.fs.impl.AggregateImpl.prepare(AggregateImpl.java:733)
              at org.apache.jackrabbit.vault.fs.impl.AggregateImpl.prepare(AggregateImpl.java:733)
              at org.apache.jackrabbit.vault.fs.impl.AggregateImpl.prepare(AggregateImpl.java:733)
              at org.apache.jackrabbit.vault.fs.impl.AggregateImpl.collect(AggregateImpl.java:684)
              at org.apache.jackrabbit.vault.fs.impl.AggregateImpl.prepare(AggregateImpl.java:747)
              at org.apache.jackrabbit.vault.fs.impl.AggregateImpl.load(AggregateImpl.java:657)
              at org.apache.jackrabbit.vault.fs.impl.AggregateImpl.getArtifacts(AggregateImpl.java:259)
              at org.apache.jackrabbit.vault.fs.impl.VaultFileImpl.<init>(VaultFileImpl.java:101)
              at org.apache.jackrabbit.vault.fs.impl.VaultFileSystemImpl.<init>(VaultFileSystemImpl.java:120)
              at org.apache.jackrabbit.vault.fs.Mounter.mount(Mounter.java:64)
              at org.apache.jackrabbit.vault.packaging.impl.PackageManagerImpl.assemble(PackageManagerImpl.java:141)
              at org.apache.jackrabbit.vault.packaging.impl.PackageManagerImpl.assemble(PackageManagerImpl.java:102)
              at org.apache.jackrabbit.vault.packaging.impl.JcrPackageManagerImpl.assemble(JcrPackageManagerImpl.java:358)
              at org.apache.jackrabbit.vault.packaging.impl.JcrPackageManagerImpl.assemble(JcrPackageManagerImpl.java:324)
      

      It seems to me that Vault traverses the complete tree and also keeps some information about every traversed node in memory.

      To validate this, I enabled trace logging for org.apache.jackrabbit.vault.fs and reproduced the problem locally by packaging the complete /jcr:system/jcr:versionStorage subtree.

      [...]
      19.09.2019 20:06:08.792 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl Create Aggregate /jcr:system
      19.09.2019 20:06:08.792 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl Collecting /jcr:system
      19.09.2019 20:06:08.792 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl descending into /jcr:system (descend=false)
      19.09.2019 20:06:08.792 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl including /jcr:system -> /jcr:system/jcr:primaryType
      19.09.2019 20:06:08.792 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl including /jcr:system -> /jcr:system
      19.09.2019 20:06:08.792 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl including /jcr:system -> /jcr:system/jcr:mixinTypes
      19.09.2019 20:06:08.792 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl including /jcr:system -> /jcr:system/jcr:versionStorage
      19.09.2019 20:06:08.793 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl descending into /jcr:system/jcr:versionStorage (descend=true)
      19.09.2019 20:06:08.793 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl including /jcr:system -> /jcr:system/jcr:versionStorage/jcr:primaryType
      19.09.2019 20:06:08.793 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl including /jcr:system -> /jcr:system/jcr:versionStorage/ee
      19.09.2019 20:06:08.793 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl descending into /jcr:system/jcr:versionStorage/ee (descend=true)
      19.09.2019 20:06:08.793 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl including /jcr:system -> /jcr:system/jcr:versionStorage/ee/jcr:primaryType
      [...]
      

      I found a lot of these "including /jcr:system -> ..." statements in the log:

      $ grep -c "AggregateImpl including" filevault.log
      174425
      $
      

      These statements are logged at [1]. At [2], something is unconditionally added to a global variable, and I think this is the problematic piece.
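
      In the trace excerpt, every "including" line names the same aggregate (/jcr:system) to the left of "->". A minimal sketch for checking whether that holds for the full log: it tallies the "including" lines per aggregate path. The class name IncludeTally and the regular expression are assumptions based only on the log format shown in this report; none of this is part of FileVault.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Stream;

// Hypothetical helper, not part of FileVault.
class IncludeTally {
    // Assumed line shape: "... AggregateImpl including <aggregate> -> <included path>"
    private static final Pattern INCLUDING =
            Pattern.compile("AggregateImpl including (\\S+) -> (\\S+)");

    // Counts "including" lines per aggregate path (the part left of "->").
    static Map<String, Integer> tally(Iterable<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = INCLUDING.matcher(line);
            if (m.find()) {
                counts.merge(m.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        // First argument: path to the (uncompressed) log file, e.g. filevault.log
        try (Stream<String> lines = Files.lines(Paths.get(args[0]))) {
            // Read the file line by line instead of loading it as a whole.
            tally(lines::iterator)
                    .forEach((aggregate, n) -> System.out.println(n + "\t" + aggregate));
        }
    }
}
```

      If a single aggregate accounts for nearly all of the counted lines, that would be consistent with one object accumulating an entry per included node.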

      I don't know the details of Vault well enough to propose a solution, but I would love to have a less memory-intensive algorithm whose memory usage does not grow linearly with the number of nodes covered by the package rules.
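
      To illustrate the difference between the two memory profiles, here is a toy sketch, not FileVault code. Node, collectAll and stream are hypothetical names invented for this example. collectAll mimics the suspected pattern (one entry retained per visited node), while stream hands each path to a callback and retains nothing itself. Whether FileVault can actually do without the retained entries is not something this sketch can answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Toy model only: these types are hypothetical and do not mirror FileVault classes.
class TraversalSketch {
    static final class Node {
        final String path;
        final List<Node> children = new ArrayList<>();

        Node(String path) {
            this.path = path;
        }

        // Creates a child below this node and returns it.
        Node add(String name) {
            Node child = new Node(path + "/" + name);
            children.add(child);
            return child;
        }
    }

    // Suspected pattern: every visited path is kept in one collection,
    // so retained memory grows linearly with the number of nodes.
    static List<String> collectAll(Node root) {
        List<String> included = new ArrayList<>();
        collect(root, included);
        return included;
    }

    private static void collect(Node node, List<String> included) {
        included.add(node.path);
        for (Node child : node.children) {
            collect(child, included);
        }
    }

    // Alternative: pass each path to a consumer and keep no per-node state;
    // the traversal's own extra memory is bounded by the recursion depth.
    static void stream(Node node, Consumer<String> sink) {
        sink.accept(node.path);
        for (Node child : node.children) {
            stream(child, sink);
        }
    }

    public static void main(String[] args) {
        Node root = new Node("/jcr:system");
        Node versionStorage = root.add("jcr:versionStorage");
        for (int i = 0; i < 1000; i++) {
            versionStorage.add("n" + i);
        }
        System.out.println("retained paths: " + collectAll(root).size());
        long[] seen = {0};
        stream(root, path -> seen[0]++);
        System.out.println("streamed paths: " + seen[0]);
    }
}
```

      In this toy the tree itself lives in memory, so only the extra bookkeeping differs between the two variants. In a repository-backed traversal the nodes would be read on demand, which is where retaining one entry per node becomes the dominant cost.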

      [1] https://github.com/apache/jackrabbit-filevault/blob/jackrabbit-filevault-3.2.8/vault-core/src/main/java/org/apache/jackrabbit/vault/fs/impl/AggregateImpl.java#L502
      [2] https://github.com/apache/jackrabbit-filevault/blob/jackrabbit-filevault-3.2.8/vault-core/src/main/java/org/apache/jackrabbit/vault/fs/impl/AggregateImpl.java#L507

        Attachments

        1. filevault.log.gz
          1.10 MB
          Jörg Hoh
        2. JCRVLT-374-proto.patch
          1 kB
          Mark Adamcin

            People

            • Assignee:
              Unassigned
              Reporter:
              joerghoh Jörg Hoh