Summary: | big Ant/Ivy builds run out of permanent memory. Classloader leaks? | ||
---|---|---|---|
Product: | Ant | Reporter: | Steve Loughran <stevel> |
Component: | Core | Assignee: | Ant Notifications List <notifications> |
Status: | NEW --- | ||
Severity: | normal | CC: | kahmyong.moon |
Priority: | P2 | ||
Version: | 1.7.0 | ||
Target Milestone: | --- | ||
Hardware: | PC | ||
OS: | Windows XP |
Description
Steve Loughran
2007-06-26 04:15:24 UTC
Note for the curious: the jira bugrep includes a heap dump that you can use Java 6's jhat against
( http://jira.smartfrog.org/jira/secure/attachment/10030/java_pid1292.julio.zip )

If you look at the list of classes that are leaking, it's the custom tasks that are being taskdef'd, in particular ivy, primarily because it's so big (once jhat is running, the url is http://localhost:7000/allInstances/0x6ae5d00).

I'm loading these using typedef, rather than antlib uris, because each task is designed to be self contained. I may be able to go back and skip the loading if they are already on the classpath, but
1. we could maybe make this an option (reload=true/false) to check before loading
2. can't we stop loading so many instances? Surely when we exit a project, it's gone.

It is hard to track down all memory leaks. It would be nice to have a build file that showed the problem (without ivy if possible, or at least without need for a network connection - i.e. self-contained).

For ant 1.7.0 a number of classloader-related memory leaks have been fixed - see for example:
http://issues.apache.org/bugzilla/show_bug.cgi?id=33061

1. peter, this is ant 1.7.0 retail. There's a big check for older versions up front and we halt the build with an error message.
2. it's not one single build, it is a big chained build that is causing excess classloadings. You can replicate it by checking out
   svn co https://smartfrog.svn.sourceforge.net/svnroot/smartfrog/trunk/core smartfrog-core
   then running "ant cruise"
3. It's not really ant itself that is leaking, or even the tasks, more the fact that if every build file reloads tasks (so it works self contained), the tasks hang around after that subant-initiated build terminates.

Note that I have the same problem with our build at the company, but I learned from an early age not to trust this type of meta build but fork individual builds instead.

well, perhaps we need a forking subant. hmmm.

Having done -v runs to see what is going on.
I am <typedef>ing the ivy antlib and smartfrog as a tasks.properties file, both of which are ignored with the (ignoring redeclaration of ... ) messages. But somehow the classloader is being retained.

Hi Steve,
Thanks, I can now repeat the problem with the following build file:

<project name="e" default="run">
  <property name="ac"
            location="${user.home}/apps/ant-contrib/ant-contrib-1.0b3.jar"/>
  <target name="run">
    <typedef resource="net/sf/antcontrib/antlib.xml" classpath="${ac}"/>
    <for begin="1" end="1000" param="p">
      <sequential>
        <antcall target="define"/>
        <echo>@{p}</echo>
      </sequential>
    </for>
  </target>
  <target name="define">
    <typedef resource="net/sf/antcontrib/antlib.xml" classpath="${ac}"/>
  </target>
</project>

If I set ANT_OPTS to:
  export ANT_OPTS="-XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:SurvivorRatio=2"
The build completes - but painfully slowly (after ~ 300 iterations the slowdown is very noticeable); the time reported for the build is ~30 minutes.

If I set ANT_OPTS to:
  export ANT_OPTS="-XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:SurvivorRatio=2 -XX:MaxPermSize=8m -XX:PermSize=8m"
The build completes relatively quickly (~ 2 1/2 minutes)
(see : http://developers.sun.com/mobility/midp/articles/garbagecollection2/#3.3 for use and description of GC flags)

I conclude that for this build file, the problem is not ant, but the GC in java and its treatment of classes.

It may be useful to add a check in <typedef> to see if the same typedef has been done before - this may cause other problems (the contents of the jars may have changed - adding new classes or antlib definitions).

As Stephane does, I also normally fork build files to avoid similar (and other) problems. I use the following macro:

<macrodef name="sub">
  <attribute name="dir"/>
  <attribute name="target"/>
  <sequential>
    <exec executable="bash" dir="@{dir}" failonerror="yes">
      <arg value="-c"/>
      <arg value="ant -emacs @{target}"/>
    </exec>
  </sequential>
</macrodef>

1. a small test is very welcome :)
2.
The problem is that the JRE hangs on to loaded classes in a separate part of the heap, and does so until all instances of the class are gone. So we need to somehow make sure we have no instances of typedef'd stuff hanging around, or references to it.

Hi Steve,
what I am saying is that there may be a bug with GC of classes. I have seen this happen with older versions of java.
I cannot get "ant cruise" to work on the checked out smartfrog - the problem is "build.xml:284: Cruise Control was not found in /home/peter/svn/main)"
I tried ant dist, this worked without a problem (linux fedora 7, jdk1.7).
Peter

1. try ant cc instead; ant cruise turns out to try and run CC
2. I've patched our common.xml to not redeclare the ivy tasks if they are already defined, so a full build no longer runs out of memory.
3. but it was, on Java 1.6

Thanks Steve, I see the bug now. My fix for the GC did not work on it.
I have tracked down a problem with Ivy. It uses an IvyContext to store information. This uses a thread local variable to achieve global variable semantics. It appears that this object does not get GCed, and as it contains objects whose classes were loaded by the AntClassLoader, the AntClassLoader also does not get GCed.

1. Is there an ivy bug # to track?
2. Is there something we can do in Ant to assist in this?
It sounds like Ivy needs to listen for build completion and purge its state when a build finishes

1) no
2) I raised the issue on the Ivy dev mailing list.

>It sounds like Ivy needs to listen for build completion and purge its state when
>a build finishes
Yes, I tried that (listening for subproject build ending and clearing IvyContext) and it seems to work (I have problems with the ant cc for smartfrog - 1) the system tests fail and 2) ivy 2.0.0 trunk sees problems with the dependences (missing javadoc artifacts))

However there is another problem with the implementation of IvyContext which I raised on the Ivy mailing list.
The way it is implemented means that sub-projects will wipe the context of master projects (if the same classloader is used for ivy in the sub-projects).

One way ant could help would be not to use a new classloader in the case where the path for the new task/type definition is the same as the current definition. At the moment this is treated as a "similar" definition, which overrides the current definition with a new classloader, but does not inform the world:

    project.log("Trying to override old definition of "
        + (isTask ? "task " : "datatype ")
        + name,
        (def.similarDefinition(old, project))
        ? Project.MSG_VERBOSE : Project.MSG_WARN);

The reason for using a new classloader is that some of the jars or directories may have changed since the last <typedef/>; however, I do not think that this happens for real (and on windows, changing the jar while it is used in a classloader is not easy).

However, this will not solve the general problem, as the master project may not have loaded the tasks/types.

>I have problems with the ant cc for
>smartfrog - 1) the system tests fail and 2) ivy 2.0.0 trunk sees
>problems with the dependences (missing javadoc artifacts))
I'll look at this. The javadocs should get published in the build.
The system tests do run on Cruise Control, but it skips some of the web tests as
port 8080 is already in use on that machine. Email me the error messages and
I'll look at them.
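
For reference, the idea floated earlier in this thread (do not create a new classloader when a repeated definition uses the same path as the current one) can be sketched roughly as below. This is only an illustration under stated assumptions: LoaderCache and loaderFor are hypothetical names, not Ant code, and a plain URLClassLoader stands in for AntClassLoader. As noted above, a cache like this will not notice when the contents of a jar or directory change between definitions.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of Ant): one classloader per distinct classpath. */
class LoaderCache {
    // Keyed by the normalized, absolute classpath entries, in order.
    private final Map<List<Path>, ClassLoader> loaders = new HashMap<>();

    /** Returns the loader already created for this classpath, or creates one. */
    synchronized ClassLoader loaderFor(List<Path> classpath) {
        List<Path> key = new ArrayList<>();
        for (Path entry : classpath) {
            key.add(entry.toAbsolutePath().normalize());
        }
        ClassLoader existing = loaders.get(key);
        if (existing != null) {
            return existing; // same path as before: reuse, do not reload
        }
        URL[] urls = new URL[key.size()];
        for (int i = 0; i < urls.length; i++) {
            urls[i] = toUrl(key.get(i));
        }
        ClassLoader created =
            new URLClassLoader(urls, LoaderCache.class.getClassLoader());
        loaders.put(key, created);
        return created;
    }

    /** Drops all cached loaders, e.g. when the top-level build ends. */
    synchronized void clear() {
        loaders.clear();
    }

    private static URL toUrl(Path entry) {
        try {
            return entry.toUri().toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("bad classpath entry: " + entry, e);
        }
    }

    public static void main(String[] args) {
        LoaderCache cache = new LoaderCache();
        List<Path> cp = Arrays.asList(Paths.get("lib/ant-contrib.jar"));
        System.out.println("reused: " + (cache.loaderFor(cp) == cache.loaderFor(cp)));
    }
}
```

The cache itself holds strong references to the loaders, so it has to be cleared at some point (the sketch assumes the end of the outermost build), otherwise it becomes another way of retaining them.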
I've created an issue in Ivy related to this problem:
https://issues.apache.org/jira/browse/IVY-639

I've just checked in a fix; the current trunk version should not have the memory leak and subproject handling problem anymore. But I don't have a good test case to test this out, so if one of you who already investigated the issue could give it a try, it would be great!

I think there are/were two problems here. First, redefining stuff with <taskdef> causes/caused leaks. Second, ivy itself was consuming stuff.

I can put some switches into our build file to redefine the ivy tasks without checking for them being present, and test against the latest code. Is there a new alpha/beta of Ivy available for me to do this?

I don't know if there's been any recent action on this, but after hitting a similar issue (with Ant 1.7.2 through 1.9.2, and Ivy 2.3.0), and poking around heap dumps for a while, I've noticed two potential issues:

1. On the Ant side, IntrospectionHelper.bean holds a reference to the Ivy classes. Bug 30162 fixed a similar issue with embedded Ants by calling IntrospectionHelper.clearCache() at the end of a build, but this doesn't cover sub-builds.

2. On the Ivy side, IvyContext.getContext() automatically pushes an IvyContext if one does not already exist (Message.getLogger() seems to do this pretty early). Since this automatic IvyContext doesn't follow the scoping rules introduced in https://issues.apache.org/jira/browse/IVY-639, this means there's one IvyContext left hanging around after the subbuild ends.

Anyway, I'll probably go with the forking workaround mentioned by others in this thread, but it would be nice to get the root issues fixed...
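
To make the second point concrete, here is a minimal sketch of how a lazily created thread-local context can outlive the scope that is supposed to clean it up. ScopedContext and all of its methods are hypothetical stand-ins written for this illustration; this is not Ivy's IvyContext code, only the pattern as described in this thread.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical stand-in for a thread-local context stack (not Ivy's IvyContext). */
class ScopedContext {
    private static final ThreadLocal<Deque<ScopedContext>> STACK = new ThreadLocal<>();

    private static Deque<ScopedContext> stack() {
        Deque<ScopedContext> s = STACK.get();
        if (s == null) {
            s = new ArrayDeque<>();
            STACK.set(s);
        }
        return s;
    }

    /** Scoped use: every push() is expected to be matched by a pop(). */
    static ScopedContext push() {
        ScopedContext c = new ScopedContext();
        stack().push(c);
        return c;
    }

    static void pop() {
        stack().pop();
    }

    /** Convenience accessor: creates a context if none exists. Nothing pops this one. */
    static ScopedContext current() {
        Deque<ScopedContext> s = stack();
        if (s.isEmpty()) {
            s.push(new ScopedContext());
        }
        return s.peek();
    }

    /** Number of contexts still referenced from the current thread. */
    static int depth() {
        Deque<ScopedContext> s = STACK.get();
        return s == null ? 0 : s.size();
    }

    /** What a build-finished hook could call to release everything on this thread. */
    static void clear() {
        STACK.remove();
    }

    public static void main(String[] args) {
        current();   // e.g. an early logger lookup creates a context implicitly
        push();      // a task runs inside its own scope...
        pop();       // ...and cleans up after itself
        System.out.println("left behind: " + depth());
        clear();
        System.out.println("after clear: " + depth());
    }
}
```

Anything the leftover context references, including classes from the loader that defined it, stays reachable from the thread until clear() runs. A blanket clear() also discards contexts pushed by an enclosing build on the same thread, which is the master/sub-project problem mentioned earlier in the thread, so a real fix would need to restore the previous state rather than wipe it.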