While performing a JUnitReport task, our build server started consistently reporting Stack Overflow errors:

build.xml:773: The following error occurred while executing this line:
build.xml:923: java.lang.StackOverflowError
    at com.sun.org.apache.xml.internal.serializer.ToStream.processDirty(ToStream.java:1571)
    at com.sun.org.apache.xml.internal.serializer.ToStream.characters(ToStream.java:1489)
    at com.sun.org.apache.xml.internal.serializer.ToHTMLStream.characters(ToHTMLStream.java:1529)
    at com.sun.org.apache.xml.internal.serializer.ToStream.characters(ToStream.java:1614)
    at com.sun.org.apache.xalan.internal.xsltc.runtime.AbstractTranslet.characters(AbstractTranslet.java:621)
    at junit_frames.br$dash$replace()
    at junit_frames.br$dash$replace()
    at junit_frames.br$dash$replace()
    ...
    at junit_frames.br$dash$replace() x1000

I tested locally and got a slightly different result:

build.xml:923: java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOfRange(Arrays.java:2694)
    at java.lang.String.<init>(String.java:203)
    at java.lang.String.substring(String.java:1877)
    at com.sun.org.apache.xalan.internal.xsltc.runtime.BasisLibrary.substring_afterF(BasisLibrary.java:329)
    at junit_frames.br$dash$replace()
    at junit_frames.br$dash$replace()
    at junit_frames.br$dash$replace()
    ...
    at junit_frames.br$dash$replace() x1000

Monitoring the build on my workstation, Java Mission Control showed the memory spiking to over 1.47GB before I got the out of memory error.

I had a look at the br-replace template in the $ANT_HOME/etc/junit-frames.xsl and $ANT_HOME/etc/junit-noframes.xsl files and it is performing a recursive replace of line returns one by one, so as soon as you perform a br-replace on a file with more line returns than your stack limit, you're going to get this error, unless you run out of memory first.

I managed to reimplement the br-replace template to be less stack/heap intensive. After trying unsuccessfully to use the java String.replace/replaceAll and StringUtils.replace functions that are used elsewhere (which seem to have issues, possibly JVM dependent) I went for a plain XSLT 1.0 implementation using a binary-subdivision approach that splits the string approximately evenly on the nearest line break on large strings:

<xsl:template name="br-replace">
    <xsl:param name="word"/>
    <xsl:param name="splitlimit">32</xsl:param>
    <xsl:variable name="secondhalflen" select="(string-length($word)+(string-length($word) mod 2)) div 2"/>
    <xsl:variable name="secondhalfword" select="substring($word, $secondhalflen)"/>
    <!-- When word is very big, a recursive replace is very heap/stack expensive, so subdivide on line break after middle of string -->
    <xsl:choose>
        <xsl:when test="(string-length($word) > $splitlimit) and (contains($secondhalfword, '&#xa;'))">
            <xsl:variable name="secondhalfend" select="substring-after($secondhalfword, '&#xa;')"/>
            <xsl:variable name="firsthalflen" select="string-length($word) - $secondhalflen"/>
            <xsl:variable name="firsthalfword" select="substring($word, 1, $firsthalflen)"/>
            <xsl:variable name="firsthalfend" select="substring-before($secondhalfword, '&#xa;')"/>
            <xsl:call-template name="br-replace">
                <xsl:with-param name="word" select="concat($firsthalfword,$firsthalfend)"/>
            </xsl:call-template>
            <br/>
            <xsl:call-template name="br-replace">
                <xsl:with-param name="word" select="$secondhalfend"/>
            </xsl:call-template>
        </xsl:when>
        <xsl:when test="contains($word, '&#xa;')">
            <xsl:value-of select="substring-before($word, '&#xa;')"/>
            <br/>
            <xsl:call-template name="br-replace">
                <xsl:with-param name="word" select="substring-after($word, '&#xa;')"/>
            </xsl:call-template>
        </xsl:when>
        <xsl:otherwise>
            <xsl:value-of select="$word"/>
        </xsl:otherwise>
    </xsl:choose>
</xsl:template>

This implementation is much more heap/stack friendly. JMC only reported a peak of 621MB, compared to the 1.47GB it hit before overflowing.
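To make the stack behaviour described above concrete, here is a rough Python model of the two recursion shapes. It is only a sketch under stated assumptions: the function names are invented for illustration, Python slices stand in for the XPath string functions, the input is synthetic, and the model tracks recursion depth only rather than producing HTML.

```python
def depth_original(word, depth=1):
    """Depth reached by the original template: one recursive call per line break."""
    if "\n" in word:
        return depth_original(word.partition("\n")[2], depth + 1)
    return depth


def depth_split(word, splitlimit=32, depth=1):
    """Depth reached by the binary-subdivision template (illustrative model)."""
    n = len(word)
    second_half_len = (n + n % 2) // 2  # same arithmetic as the XSLT variable
    # XPath substring() counts from 1, so position p is Python index p - 1.
    second_half_word = word[max(second_half_len - 1, 0):]
    if n > splitlimit and "\n" in second_half_word:
        first_half_end, _, second_half_end = second_half_word.partition("\n")
        first_half_word = word[:n - second_half_len]
        return max(
            depth_split(first_half_word + first_half_end, splitlimit, depth + 1),
            depth_split(second_half_end, splitlimit, depth + 1),
        )
    if "\n" in word:
        # Fallback: no line break in the second half, so peel off one line.
        return depth_split(word.partition("\n")[2], splitlimit, depth + 1)
    return depth


# Hypothetical input: 500 lines of 55 characters each.
text = ("x" * 55 + "\n") * 500
print("original depth:", depth_original(text))
print("split depth:   ", depth_split(text))
```

In this model the original shape needs one frame per line break plus one, while the split shape stays at a few dozen frames at most for the same input. The input is kept to 500 lines on purpose: with a few thousand lines the first function would hit Python's own default recursion limit, which is the same failure mode as the StackOverflowError above.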
Created attachment 32282: junit-frames.xsl fixed br-replace template as described
Created attachment 32283: junit-noframes.xsl fixed br-replace template as described
Just realised these files are in the main ant repository. This probably belongs in core tasks component.
Ryan, please don't close the issue until the patch has made its way into the official repository. :-)
Oops, cheers.
I can see how your approach limits the amount of stack being used in certain cases, but am unsure about splitlimit's default. In a degenerate case where I have a long text with a line break every 72 columns (roughly) I'd get as many recursive calls as before, wouldn't I?

What does the big amount of text that causes the stack overflow or OOM in your case look like? A very long stacktrace? In my experience individual lines of a stack trace tend to be longer than 32 characters.

(In reply to Stefan Bodewig from comment #6)
> I can see how your approach limits the amount of stack being used in certain
> cases, but am unsure about splitlimit's default. In a degenerate case
> where I have a long text with a line break every 72 columns (roughly) I'd
> get as many recursive calls as before, wouldn't I?
>
> What does the big amount of text that causes the stack overflow or OOM in
> your case look like? A very long stacktrace? In my experience individual
> lines of a stack trace tend to be longer than 32 characters.

It turned out to be a large XML file in the end. Average line length 55, but with significant variance, very long lines, and some runs of line returns.

The original implementation would parse the first line and give the rest to the recursive call, which resulted in N-1 + N-2 + N-3... lines being copied and passed down each time, so the amount of memory required is of the order N^2 until you get to the last line and can return all the way back down the N-length stack. With this implementation the stack depth is roughly log2(N) and memory is of the order 2N, until you get down to small chunks of text (of length splitlimit/2), at which point we shouldn't be in any danger of running out of memory.

The default splitlimit is a bit arbitrary; you don't want it so large that the remaining text can overflow your stack in the worst case. It just seemed a lot of work to keep splitting text in two beyond some point, rather than parse the rest in the original recursive manner, especially given expected and best/worst case line lengths.

You've also got the added factor that it gives up splitting when it doesn't find a line return in the second half of the chunk of text being processed, at which point it checks for a line return (in the first half) and does the normal recursive replace until there are no line returns left.

By all means tweak the splitlimit default.
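The N^2 versus 2N argument can be checked with a small Python model. This is a sketch under assumptions, not a measurement of the JVM: the function names are invented, the input is synthetic, and "memory" is approximated as the total length of the `word` argument held by every active call at the deepest point of the recursion, ignoring temporaries.

```python
def peak_live_original(word, live=0):
    """Characters held by all active calls at the deepest point (original shape)."""
    live += len(word)
    if "\n" in word:
        return peak_live_original(word.partition("\n")[2], live)
    return live


def peak_live_split(word, splitlimit=32, live=0):
    """Same measure for the binary-subdivision shape (illustrative model)."""
    live += len(word)
    n = len(word)
    second_half_len = (n + n % 2) // 2  # same arithmetic as the XSLT variable
    # XPath substring() counts from 1, so position p is Python index p - 1.
    second_half_word = word[max(second_half_len - 1, 0):]
    if n > splitlimit and "\n" in second_half_word:
        first_half_end, _, second_half_end = second_half_word.partition("\n")
        first_half_word = word[:n - second_half_len]
        return max(
            peak_live_split(first_half_word + first_half_end, splitlimit, live),
            peak_live_split(second_half_end, splitlimit, live),
        )
    if "\n" in word:
        # Fallback branch: peel off one line, as the original template does.
        return peak_live_split(word.partition("\n")[2], splitlimit, live)
    return live


# Hypothetical input: 500 lines of 55 characters each (28000 characters).
text = ("x" * 55 + "\n") * 500
print("input length:      ", len(text))
print("original peak live:", peak_live_original(text))
print("split peak live:   ", peak_live_split(text))
```

For this input the original shape keeps the sum 28000 + 27944 + ... + 56 characters alive at its deepest call, which grows quadratically with the number of lines, while the split shape stays within a small multiple of the input length because each level holds about half of the previous one.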
My mistake, I simply did my calculations on recursion depth wrong.
I've just fiddled with whitespace to make the diff smaller:
http://git-wip-us.apache.org/repos/asf/ant/commit/f7f5327d

and since the same applies to the stylesheets shipping with AntUnit:
http://git-wip-us.apache.org/repos/asf/ant-antlibs-antunit/commit/7396c8b6
Comment on attachment 32282: junit-frames.xsl fixed br-replace template as described

--- junit-frames.xsl.orig 2014-12-11 13:03:07.080917200 +0000
+++ junit-frames.xsl 2014-12-11 13:01:36.280917200 +0000
@@ -928,19 +928,36 @@
     @param word the text from which to convert CR to BR tag
 -->
 <xsl:template name="br-replace">
-    <xsl:param name="word"/>
-    <xsl:choose>
-        <xsl:when test="contains($word, '&#xa;')">
-            <xsl:value-of select="substring-before($word, '&#xa;')"/>
-            <br/>
-            <xsl:call-template name="br-replace">
-                <xsl:with-param name="word" select="substring-after($word, '&#xa;')"/>
-            </xsl:call-template>
-        </xsl:when>
-        <xsl:otherwise>
-            <xsl:value-of select="$word"/>
-        </xsl:otherwise>
-    </xsl:choose>
+    <xsl:param name="word"/>
+    <xsl:param name="splitlimit">32</xsl:param>
+    <xsl:variable name="secondhalflen" select="(string-length($word)+(string-length($word) mod 2)) div 2"/>
+    <xsl:variable name="secondhalfword" select="substring($word, $secondhalflen)"/>
+    <!-- When word is very big, a recursive replace is very heap/stack expensive, so subdivide on line break after middle of string -->
+    <xsl:choose>
+        <xsl:when test="(string-length($word) > $splitlimit) and (contains($secondhalfword, '&#xa;'))">
+            <xsl:variable name="secondhalfend" select="substring-after($secondhalfword, '&#xa;')"/>
+            <xsl:variable name="firsthalflen" select="string-length($word) - $secondhalflen"/>
+            <xsl:variable name="firsthalfword" select="substring($word, 1, $firsthalflen)"/>
+            <xsl:variable name="firsthalfend" select="substring-before($secondhalfword, '&#xa;')"/>
+            <xsl:call-template name="br-replace">
+                <xsl:with-param name="word" select="concat($firsthalfword,$firsthalfend)"/>
+            </xsl:call-template>
+            <br/>
+            <xsl:call-template name="br-replace">
+                <xsl:with-param name="word" select="$secondhalfend"/>
+            </xsl:call-template>
+        </xsl:when>
+        <xsl:when test="contains($word, '&#xa;')">
+            <xsl:value-of select="substring-before($word, '&#xa;')"/>
+            <br/>
+            <xsl:call-template name="br-replace">
+                <xsl:with-param name="word" select="substring-after($word, '&#xa;')"/>
+            </xsl:call-template>
+        </xsl:when>
+        <xsl:otherwise>
+            <xsl:value-of select="$word"/>
+        </xsl:otherwise>
+    </xsl:choose>
 </xsl:template>

 <xsl:template name="display-time">
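One detail of the midpoint arithmetic in the patch may be worth double-checking. The Python sketch below transliterates only the split step, assuming XPath 1.0 semantics in which substring() positions start at 1. The helper names and the second variant are hypothetical and are not part of the attachment. Under that assumption, even-length input appears to place the character at the midpoint in both halves, while odd-length input splits cleanly.

```python
def split_as_posted(word):
    """Arguments of the two recursive calls, using the patch's arithmetic."""
    n = len(word)
    second_half_len = (n + n % 2) // 2
    # substring($word, $secondhalflen): 1-based position -> Python index - 1
    second_half_word = word[max(second_half_len - 1, 0):]
    # substring($word, 1, $firsthalflen)
    first_half_word = word[:n - second_half_len]
    first_half_end, _, second_half_end = second_half_word.partition("\n")
    return first_half_word + first_half_end, second_half_end


def split_from_actual_length(word):
    """Hypothetical variant: size the first half from what the second half covers."""
    n = len(word)
    second_half_start = (n + n % 2) // 2
    second_half_word = word[max(second_half_start - 1, 0):]
    first_half_word = word[:n - len(second_half_word)]
    first_half_end, _, second_half_end = second_half_word.partition("\n")
    return first_half_word + first_half_end, second_half_end


odd = "A" * 20 + "\n" + "B" * 16   # 37 characters
even = "A" * 20 + "\n" + "B" * 15  # 36 characters
for word in (odd, even):
    posted_left, posted_right = split_as_posted(word)
    variant_left, variant_right = split_from_actual_length(word)
    print(len(word), len(posted_left), len(variant_left), len(posted_right))
```

For the 36-character input the posted arithmetic hands 21 `A` characters to the first recursive call although the line only contains 20, whereas the variant that subtracts the actual length of the second half hands over 20 in both cases. Whether this matters in practice depends on the XSLT processor following the 1-based substring() rule, so treat it as something to verify against real output rather than a confirmed defect.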