Details
-
Bug
-
Status: Closed
-
Minor
-
Resolution: Fixed
-
3.9.0
-
None
Description
When testing the upcoming Maven release 3.9.0 I noticed the builds for large multi-module projects are somewhat slower.
I was mainly using the Quarkus projects, which has more than 1k modules. I was specifically bulding the 2.16.1.Final tag. Below are the numbers I got.
First batch:
MAVEN_OPTS="-XX:+UseParallelGC -Xms2g -Xmx2g"
Maven cmd | Build time - Maven 3.8.7 | Build time - Maven 3.9.0 |
---|---|---|
clean | 2.9s | 3.2s |
validate | 02:17min | 02:34min |
validate -T8 | 29s | 33s |
validate -Dversion.enforcer.plugin=3.2.1 | 29s | 40s |
validate -Dversion.enforcer.plugin=3.2.1 -T8 | 9s | 14s |
Second batch (bigger heap):
MAVEN_OPTS="-XX:+UseParallelGC -Xms5g -Xmx5g"
Maven cmd | Build time - Maven 3.8.7 | Build time - Maven 3.9.0 |
---|---|---|
clean | 2.9s | 3.2s |
validate | 02:11min | 02:22min |
validate -T8 | 25s | 28s |
validate -Dversion.enforcer.plugin=3.2.1 | 29s | 35s |
validate -Dversion.enforcer.plugin=3.2.1 -T8 | 9s | 11s |
install -DskipTests -Dversion.enforcer.plugin=3.2.1 -T8 | 01:49min | 01:55min |
Notes:
- The numbers are taken from the Maven output (the Wall Clock time). When testing with the time utility, the "real" execution time is about 0.6s longer (the time JVM needs to start-up). The values in the tables are averages over several runs (3 to 10 depending on the time the specific run actually took).
- All builds were run on Linux x64 (kernel 6.1.8) with JDK 19.0.1
- HW used during testing: AMD Ryzen 5800x (8 physical cores), 32GB RAM and SSD Samsung 980 Pro 1TB (these should not really matter for the comparison itself, but I am including it for completeness)
- Using ParallelGC since it should be the one with highest throughput (at the cost of longer pause times, which I don't really care about for Maven builds). Quick comparison with G1GC shows the builds are slightly faster, but it is almost negligible.
There has already been some discussion regarding this in this older merged PR https://github.com/apache/maven-resolver/pull/166 (which may be the cause behind the slowdown).
See the attachments for CPU / allocation profiles (flame graphs), running mvn validate with both 3.8.7 and 3.9.0.
My analysis of the attached CPU / allocation profiles:
I was mainly looking at the differences in those two profiles (3.8.7 vs 3.9.0), not at the absolute numbers. There are possibly other places that could use some optimization, but that is outside of scope for this one.
Both CPU and allocation profiles show that the CPU time / allocations increased the most in a code which seems to be doing some sort of dependency resolution, ultimately boiling down to a call to org.eclipse.aether.internal.impl.DefaultRepositorySystem.collectDependencies. The flame graphs show a bit different method calls inside Aether (maven-resolver), mainly because there are some structural changes (e.g. different class names) to the Aether dependency resolution between maven-resolver-1.6.3 (used in 3.8.7) and maven-resolver-1.9.4 (used in 3.9.0). The dependency resolution algorithm should be the same though (or at least as far as I can tell).
Digging further down, the biggest difference seems to be in the number of calls to org.eclipse.aether.artifact.DefaultArtifact.<init>. The constructor includes merging two property maps together, which results in quite some work (CPU) and lots of garbage (HashMap$Node, etc).
With the above assumption, I went on to profile the number of method (constructor) calls to org.eclipse.aether.artifact.DefaultArtifact.<init> (see the attachments profile-method-DefaultArtifact.init-(3.8.7|3.9.0).html). The absolute numbers say that Maven 3.8.7 allocated ~9mil of these objects, while Maven 3.9.0 allocated ~26mil instances. That is about 3x times as much. When drilling down to the method org.eclipse.aether.internal.impl.DefaultRepositorySystem.collectDependencies and summing all the instances (there are three), we get ~2mil for instances for Maven 3.8.7 and ~19mil instances for Maven 3.9.0. Even if this would not be the root cause the "slowness" I am seeing, it is definitely something to look at.
After some further digging in the Aether maven-resolver code I found the already mentioned change done in PR https://github.com/apache/maven-resolver/pull/166. It basically replaces strong reference to org.eclipse.aether.artifact.DefaultArtifact.<init> with a weak reference, which likely means the instances are cleaned-up much faster, and next time they are needed they must be constructed from scratch again. The change basically trades lower memory consumption for higher CPU usage (and more garbage).
I then tried to revert the change in https://github.com/apache/maven-resolver/pull/166, build maven-resolver 1.9.5-SNAPSHOT, update the resolver in maven-3.9.x branch and build from that. With this change reverted, I am basically seeing the same numbers as with Maven 3.8.7. Running the method profiling for org.eclipse.aether.artifact.DefaultArtifact.<init> with the updated Maven also shows the number of instances is back ~9mil as with Maven 3.8.7. Thus, at this point, I believe the change in https://github.com/apache/maven-resolver/pull/166 is the root behind the "slowness". If the change itself should remain, then we should probably take another look at how to better cache the instances.
Attachments
Attachments
Issue Links
- is fixed by
-
MNG-7695 Upgrade to resolver 1.9.5
- Closed
- relates to
-
MRESOLVER-250 Usage of descriptors map in DataPool prevents gargabe collection
- Closed
-
MNG-5896 Download dependency POMs in parallel
- Reopened
-
MRESOLVER-320 Investigate slower resolving speeds as reported by users
- Closed
- links to