Details
Improvement
Status: Open
Major
Resolution: Unresolved
3.2.8
None
None
Description
I came across a situation in which packaging a huge subtree (/jcr:system/jcr:versionStorage; a bad idea, I know) caused a huge spike in memory usage, which in turn caused many full GCs (due to allocation failures).
I have several stacktraces from that time, which all look very similar to this one:
"qtp1597826410-38130" prio=5 tid=0x94f2 nid=0xffffffff runnable
   java.lang.Thread.State: RUNNABLE
    at org.apache.jackrabbit.oak.segment.SegmentNodeBuilder.createChildBuilder(SegmentNodeBuilder.java:147)
    at org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.getChildNode(MemoryNodeBuilder.java:330)
    at org.apache.jackrabbit.oak.core.SecureNodeBuilder.<init>(SecureNodeBuilder.java:110)
    at org.apache.jackrabbit.oak.core.SecureNodeBuilder.getChildNode(SecureNodeBuilder.java:327)
    at org.apache.jackrabbit.oak.core.MutableTree.getTree(MutableTree.java:288)
    at org.apache.jackrabbit.oak.core.MutableRoot.getTree(MutableRoot.java:220)
    at org.apache.jackrabbit.oak.core.MutableRoot.getTree(MutableRoot.java:69)
    at org.apache.jackrabbit.oak.jcr.session.WorkspaceImpl$1.getTypes(WorkspaceImpl.java:85)
    at org.apache.jackrabbit.oak.plugins.nodetype.ReadOnlyNodeTypeManager.isNodeType(ReadOnlyNodeTypeManager.java:293)
    at org.apache.jackrabbit.oak.jcr.session.NodeImpl$24.perform(NodeImpl.java:931)
    at org.apache.jackrabbit.oak.jcr.session.NodeImpl$24.perform(NodeImpl.java:926)
    at org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.perform(SessionDelegate.java:207)
    at org.apache.jackrabbit.oak.jcr.session.ItemImpl.perform(ItemImpl.java:112)
    at org.apache.jackrabbit.oak.jcr.session.NodeImpl.isNodeType(NodeImpl.java:926)
    at org.apache.jackrabbit.vault.fs.impl.aggregator.FileAggregator.matches(FileAggregator.java:66)
    at org.apache.jackrabbit.vault.fs.impl.AggregatorProvider.getAggregator(AggregatorProvider.java:68)
    at org.apache.jackrabbit.vault.fs.impl.AggregateManagerImpl.getAggregator(AggregateManagerImpl.java:455)
    at org.apache.jackrabbit.vault.fs.impl.AggregateImpl.prepare(AggregateImpl.java:720)
    at org.apache.jackrabbit.vault.fs.impl.AggregateImpl.prepare(AggregateImpl.java:733)
    at org.apache.jackrabbit.vault.fs.impl.AggregateImpl.prepare(AggregateImpl.java:733)
    at org.apache.jackrabbit.vault.fs.impl.AggregateImpl.prepare(AggregateImpl.java:733)
    at org.apache.jackrabbit.vault.fs.impl.AggregateImpl.prepare(AggregateImpl.java:733)
    at org.apache.jackrabbit.vault.fs.impl.AggregateImpl.prepare(AggregateImpl.java:733)
    at org.apache.jackrabbit.vault.fs.impl.AggregateImpl.collect(AggregateImpl.java:684)
    at org.apache.jackrabbit.vault.fs.impl.AggregateImpl.prepare(AggregateImpl.java:747)
    at org.apache.jackrabbit.vault.fs.impl.AggregateImpl.load(AggregateImpl.java:657)
    at org.apache.jackrabbit.vault.fs.impl.AggregateImpl.getArtifacts(AggregateImpl.java:259)
    at org.apache.jackrabbit.vault.fs.impl.VaultFileImpl.<init>(VaultFileImpl.java:101)
    at org.apache.jackrabbit.vault.fs.impl.VaultFileSystemImpl.<init>(VaultFileSystemImpl.java:120)
    at org.apache.jackrabbit.vault.fs.Mounter.mount(Mounter.java:64)
    at org.apache.jackrabbit.vault.packaging.impl.PackageManagerImpl.assemble(PackageManagerImpl.java:141)
    at org.apache.jackrabbit.vault.packaging.impl.PackageManagerImpl.assemble(PackageManagerImpl.java:102)
    at org.apache.jackrabbit.vault.packaging.impl.JcrPackageManagerImpl.assemble(JcrPackageManagerImpl.java:358)
    at org.apache.jackrabbit.vault.packaging.impl.JcrPackageManagerImpl.assemble(JcrPackageManagerImpl.java:324)
It seems to me that vault traverses the complete tree and keeps some information about every traversed node in memory.
To validate this, I enabled trace logging for org.apache.jackrabbit.vault.fs and tried to reproduce the problem locally by packaging the complete /jcr:system/jcr:versionStorage subtree.
[...]
19.09.2019 20:06:08.792 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl Create Aggregate /jcr:system
19.09.2019 20:06:08.792 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl Collecting /jcr:system
19.09.2019 20:06:08.792 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl descending into /jcr:system (descend=false)
19.09.2019 20:06:08.792 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl including /jcr:system -> /jcr:system/jcr:primaryType
19.09.2019 20:06:08.792 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl including /jcr:system -> /jcr:system
19.09.2019 20:06:08.792 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl including /jcr:system -> /jcr:system/jcr:mixinTypes
19.09.2019 20:06:08.792 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl including /jcr:system -> /jcr:system/jcr:versionStorage
19.09.2019 20:06:08.793 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl descending into /jcr:system/jcr:versionStorage (descend=true)
19.09.2019 20:06:08.793 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl including /jcr:system -> /jcr:system/jcr:versionStorage/jcr:primaryType
19.09.2019 20:06:08.793 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl including /jcr:system -> /jcr:system/jcr:versionStorage/ee
19.09.2019 20:06:08.793 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl descending into /jcr:system/jcr:versionStorage/ee (descend=true)
19.09.2019 20:06:08.793 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl including /jcr:system -> /jcr:system/jcr:versionStorage/ee/jcr:primaryType
[...]
I found a lot of these "including /jcr:system -> ..." statements in the log:
$ grep -c "AggregateImpl including" filevault.log
174425
$
These statements are logged at [1]. At [2], something is unconditionally added to a global variable, and I think this is the problematic piece.
I don't know the details of vault well enough to propose a solution, but I would love to have a less memory-intensive algorithm whose memory usage does not grow linearly with the number of nodes covered by the package rules.
[1] https://github.com/apache/jackrabbit-filevault/blob/jackrabbit-filevault-3.2.8/vault-core/src/main/java/org/apache/jackrabbit/vault/fs/impl/AggregateImpl.java#L502
[2] https://github.com/apache/jackrabbit-filevault/blob/jackrabbit-filevault-3.2.8/vault-core/src/main/java/org/apache/jackrabbit/vault/fs/impl/AggregateImpl.java#L507
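To illustrate the kind of difference I mean, here is a minimal sketch. It is not FileVault code and makes no claim about how AggregateImpl works internally: the Node class and the collectAll and visit methods are hypothetical names invented for this example. The first method retains one entry per traversed node (memory grows linearly with the tree size), while the second hands each path to a callback and retains nothing beyond the recursion stack.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class TraversalSketch {

    /** Hypothetical stand-in for a repository node; not a JCR or FileVault type. */
    public static final class Node {
        public final String path;
        public final List<Node> children = new ArrayList<>();

        public Node(String path) {
            this.path = path;
        }

        /** Adds a child below this node and returns it. */
        public Node add(String name) {
            Node child = new Node(path + "/" + name);
            children.add(child);
            return child;
        }
    }

    /** Accumulating variant: keeps one list entry per visited node. */
    public static List<String> collectAll(Node root) {
        List<String> included = new ArrayList<>();
        collect(root, included);
        return included;
    }

    private static void collect(Node node, List<String> included) {
        included.add(node.path); // retained until the caller drops the list
        for (Node child : node.children) {
            collect(child, included);
        }
    }

    /** Streaming variant: passes each path to the sink and stores nothing itself. */
    public static void visit(Node node, Consumer<String> sink) {
        sink.accept(node.path);
        for (Node child : node.children) {
            visit(child, sink);
        }
    }

    public static void main(String[] args) {
        Node root = new Node("/jcr:system");
        root.add("jcr:versionStorage").add("ee");

        // Holds every path in memory at once.
        System.out.println(collectAll(root));

        // Processes each path as it is encountered.
        visit(root, System.out::println);
    }
}
```

Whether the aggregate preparation in vault can be restructured along these lines depends on whether later steps need the full set of included paths at once, which I cannot judge.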