Details

    • Type: Bug Bug
    • Status: Open
    • Priority: Blocker Blocker
    • Resolution: Unresolved
    • Affects Version/s: 2.7
    • Fix Version/s: None
    • Component/s: Xalan
    • Labels:
      None

      Description

      In class org.apache.xml.utils.XMLReaderManager

      getXMLReader() method creates a new XMLReader (i.e. SAXParser) and stores it into ThreadLocal.

      releaseXMLReader() does not remove (set to null) ThreadLocal thus creating a permanent leak.

      Unfortunately the size of the cached Reader is typically dependent upon the size of the XML document you process (depends on implementation but this is the case with xerces SAXParser). In heavy load server environments with thread pools of tens and hundreds of threads the server sustains a significant memory leak (hundreds of megabytes - depending on the XML document sizes and number of threads in a thread pools).

      A fix is trivial:

      Put the following line at the end of releaseXMLReader method:

      m_readers.set(null);

      I wonder, why is reader stored in ThreadLocal in the first place?

      1. xalan-j2-2.6.0-xmlmemoryleak-tm2l.patch
        0.8 kB
        Michel Loiseleur
      2. gc-roots.jpg
        34 kB
        Eric Burke
      3. retained-object-sizes.jpg
        54 kB
        Eric Burke

        Activity

        Hide
        Robert Olofsson added a comment -

        That looks very interesting, maybe finally there will an official solution for this leak.

        Show
        Robert Olofsson added a comment - That looks very interesting, maybe finally there will an official solution for this leak.
        Hide
        Matteo TURRA added a comment -

        I applied this patch but I run into lock in synchronized method (getXMLReader, releaseXMLReader).

        I found this issue in this JBoss Enterprise Application Platform patch JBPAPP-7093: " In an earlier release the caching behaviour in XMLReaderManager was removed to resolve a memory leak. This forced each invocation to recreate the XMLReader and, as a result of the presence of the Synchronized methods, contend with other threads. Caching has been reinstated through the use of the SAXParser class, resolving the issue of thread contention. The original issue of a memory leak is avoided by use of the SAXParser reset method."

        See: https://issues.jboss.org/browse/JBPAPP-7093

        Show
        Matteo TURRA added a comment - I applied this patch but I run into lock in synchronized method (getXMLReader, releaseXMLReader). I found this issue in this JBoss Enterprise Application Platform patch JBPAPP-7093: " In an earlier release the caching behaviour in XMLReaderManager was removed to resolve a memory leak. This forced each invocation to recreate the XMLReader and, as a result of the presence of the Synchronized methods, contend with other threads. Caching has been reinstated through the use of the SAXParser class, resolving the issue of thread contention. The original issue of a memory leak is avoided by use of the SAXParser reset method." See: https://issues.jboss.org/browse/JBPAPP-7093
        Hide
        Robert Olofsson added a comment -

        Is there anything a mortal person can do to speed up the fix of this issue now that it has celebrated it's fifth birthday?

        Show
        Robert Olofsson added a comment - Is there anything a mortal person can do to speed up the fix of this issue now that it has celebrated it's fifth birthday?
        Hide
        John Connors added a comment -

        When will this bug be resolved? It's hard to believe a problem of this magnitude hasn't been fixed for 4 years now.

        Show
        John Connors added a comment - When will this bug be resolved? It's hard to believe a problem of this magnitude hasn't been fixed for 4 years now.
        Hide
        Michel Loiseleur added a comment -

        Hi,

        We have made a fix which works well for this memory leak (more than 6 months without any memory leak). Feel free to apply it and include it in 2.7.0 code base.

        Regards,

        Show
        Michel Loiseleur added a comment - Hi, We have made a fix which works well for this memory leak (more than 6 months without any memory leak). Feel free to apply it and include it in 2.7.0 code base. Regards,
        Hide
        Michel Loiseleur added a comment -

        Path which fixes this issue

        Show
        Michel Loiseleur added a comment - Path which fixes this issue
        Hide
        Jörg von Frantzius added a comment -

        XMLReaderManager's only purpose seems to be that dreaded caching, as its first revision already contains it and the submit comment reads like this:

        337775 04.12.2003 17:44:50 4 zongaro Moved code for caching XMLReader objects from XSLTC's TransformerFactoryImpl to a new org.apache.xml.utils.XMLReaderManager class. It is now the responsibility of the DTMManagerDefault class to request one of these cached XMLReader objects, so the benefit of reusing an XMLReader is now conferred upon both XSLTC and Xalan-J Interpretive, as well as upon references to the document() function.

        It seems that Xalan-Java 2.5.2 is the last version not containing this, according to http://xml.apache.org/xalan-j/history.html

        Show
        Jörg von Frantzius added a comment - XMLReaderManager's only purpose seems to be that dreaded caching, as its first revision already contains it and the submit comment reads like this: 337775 04.12.2003 17:44:50 4 zongaro Moved code for caching XMLReader objects from XSLTC's TransformerFactoryImpl to a new org.apache.xml.utils.XMLReaderManager class. It is now the responsibility of the DTMManagerDefault class to request one of these cached XMLReader objects, so the benefit of reusing an XMLReader is now conferred upon both XSLTC and Xalan-J Interpretive, as well as upon references to the document() function. It seems that Xalan-Java 2.5.2 is the last version not containing this, according to http://xml.apache.org/xalan-j/history.html
        Hide
        Jörg von Frantzius added a comment -

        This bug is a blocker for 2.7.0 in J2EE environments. Long running servers or large XML data will cause OutOfMemoryErrors.

        Show
        Jörg von Frantzius added a comment - This bug is a blocker for 2.7.0 in J2EE environments. Long running servers or large XML data will cause OutOfMemoryErrors.
        Hide
        Jörg von Frantzius added a comment -

        It's a pity that nothing has happened on this blocker bug for almost a year now!

        Does anybody know what the last relase of xalan is that does not yet have this caching optimization?

        Thanks!

        Show
        Jörg von Frantzius added a comment - It's a pity that nothing has happened on this blocker bug for almost a year now! Does anybody know what the last relase of xalan is that does not yet have this caching optimization? Thanks!
        Hide
        Sébastien Tardif added a comment -

        The fix provided is trying to do the same thing as removing all the caching logic that the class is doing. So the real fix is to remove all the caching logics. So that m_readers and m_inUse class variable should be removed. The side effect is that synchronization is not needed anymore.

        The caching was an optimization that introduce too many problems:

        • in the majority of cases, application that deal with XML will be lot slower "playing with the XML" than creating an instance of XMLReader, so the optimization is useless
        • if an application want to cache something, they can cache other places or use AOP to do the same caching that was done
        • they is no official API in XMLReader to reset the XMLReader so that it can be cached but without its previous state
        • SAXParser has reset method but doesn't clean all the previous state in particular the one taking lot of memory
        • synchronization is a limiting factor for scalability
        • if a caller forget to put a finally block to release the XMLReader more important memory leak will occur, so the design was fragile
        Show
        Sébastien Tardif added a comment - The fix provided is trying to do the same thing as removing all the caching logic that the class is doing. So the real fix is to remove all the caching logics. So that m_readers and m_inUse class variable should be removed. The side effect is that synchronization is not needed anymore. The caching was an optimization that introduce too many problems: in the majority of cases, application that deal with XML will be lot slower "playing with the XML" than creating an instance of XMLReader, so the optimization is useless if an application want to cache something, they can cache other places or use AOP to do the same caching that was done they is no official API in XMLReader to reset the XMLReader so that it can be cached but without its previous state SAXParser has reset method but doesn't clean all the previous state in particular the one taking lot of memory synchronization is a limiting factor for scalability if a caller forget to put a finally block to release the XMLReader more important memory leak will occur, so the design was fragile
        Hide
        Eric Burke added a comment -

        The hack/fix I posted earlier can lead to NullPointerException. I added this line to the start of the method:

        if (reader == null)

        { return; }

        I realize this is a hacky workaround; sorry I don't have time to investigate a more solid fix. We've been running in production all day now, and with this fix (in addition to the code I listed in an earlier comment) the leak is gone.

        Show
        Eric Burke added a comment - The hack/fix I posted earlier can lead to NullPointerException. I added this line to the start of the method: if (reader == null) { return; } I realize this is a hacky workaround; sorry I don't have time to investigate a more solid fix. We've been running in production all day now, and with this fix (in addition to the code I listed in an earlier comment) the leak is gone.
        Hide
        Eric Burke added a comment -

        This affects Xalan 2.6.0 for us, although the bug says 2.7.0.

        Show
        Eric Burke added a comment - This affects Xalan 2.6.0 for us, although the bug says 2.7.0.
        Hide
        Eric Burke added a comment -

        We see this issue whenever we try to use Java 5. Our app runs fine under 1.4, but when we try to run under Java 5, it runs out of memory after about 4 hours. This is a server-side app running on Linux, JDK 1.5.0_06, and JBoss 4.0.3sp1. I profiled using YourKit Profiler and attached two screen shots showing the leak in action. I edited my own copy of XMLReaderManager like this:

        public synchronized void releaseXMLReader(XMLReader reader) {
        // If the reader that's being released is the cached reader
        // for this thread, mark it as no longer being in use.
        if (m_readers.get() == reader)

        { m_readers.set(null); // ERIC m_inUse.put(reader, Boolean.FALSE); }

        m_inUse.remove(reader); // ERIC
        }

        After that change, the leak is gone.

        Show
        Eric Burke added a comment - We see this issue whenever we try to use Java 5. Our app runs fine under 1.4, but when we try to run under Java 5, it runs out of memory after about 4 hours. This is a server-side app running on Linux, JDK 1.5.0_06, and JBoss 4.0.3sp1. I profiled using YourKit Profiler and attached two screen shots showing the leak in action. I edited my own copy of XMLReaderManager like this: public synchronized void releaseXMLReader(XMLReader reader) { // If the reader that's being released is the cached reader // for this thread, mark it as no longer being in use. if (m_readers.get() == reader) { m_readers.set(null); // ERIC m_inUse.put(reader, Boolean.FALSE); } m_inUse.remove(reader); // ERIC } After that change, the leak is gone.
        Hide
        Eric Burke added a comment -

        Another screen from a YourKit profiling session showing "GC Roots", further clarifying the XMLReaderManager leak.

        Show
        Eric Burke added a comment - Another screen from a YourKit profiling session showing "GC Roots", further clarifying the XMLReaderManager leak.
        Hide
        Eric Burke added a comment -

        Retained object sizes as captured using YourKit Java profiler, demonstrating how significant this leak is. Because Tomcat puts threads into a pool, the ThreadLocals never go away.

        Show
        Eric Burke added a comment - Retained object sizes as captured using YourKit Java profiler, demonstrating how significant this leak is. Because Tomcat puts threads into a pool, the ThreadLocals never go away.

          People

          • Assignee:
            Unassigned
            Reporter:
            Marko Strukelj
          • Votes:
            16 Vote for this issue
            Watchers:
            12 Start watching this issue

            Dates

            • Created:
              Updated:

              Development