This issue was found by Henry Zongaro.
If you try the following stylesheet, you'll see that the character x8C, which is not permitted in literal form in XML 1.1, is escaped when it appears in an element's character content, but it's not escaped when it is part of an attribute value.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" version="1.1"/>
<xsl:template match="/">
<out att="&#x8C;">&#x8C;</out>
</xsl:template>
</xsl:stylesheet>
When Xerces parses the serialized XML produced by this stylesheet, it can go into an infinite loop (possibly depending on the version of Xerces) while parsing the attribute that contains the invalid character.
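To check how a given serializer handles this case, the stylesheet can be run through the standard JAXP API. This is a minimal sketch: the class name SerializeRepro, the run() helper and the dummy input document <doc/> are made up for illustration, and the result depends on whichever XSLT implementation JAXP picks up, so no particular output is claimed here.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class SerializeRepro {
    // Same stylesheet as above, with x8C written as a character reference.
    static final String STYLESHEET =
          "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">"
        + "<xsl:output method=\"xml\" version=\"1.1\"/>"
        + "<xsl:template match=\"/\">"
        + "<out att=\"&#x8C;\">&#x8C;</out>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Runs the stylesheet against a dummy input and returns the serialized result. */
    public static String run() {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader("<doc/>")), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = run();
        System.out.println(xml);
        // Report whether x8C survived as a literal character anywhere in the output.
        System.out.println("literal x8C present: " + (xml.indexOf((char) 0x8C) >= 0));
    }
}
```

If the last line reports true, the serializer in use still writes the character in literal form somewhere in the output.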
The CharInfo changes do the following:
1. Previously the CharInfo objects for HTML, TEXT and XML were all cached in a static Hashtable. That seemed good for performance, but the downside was that CharInfo's getOutputStringForChar(char) method, which returns the entity for a given char (e.g. maps '<' to "&lt;"), was synchronized. When generating HTML, which has lots of entities coming from the HTMLEntities.properties file, this can be a bottleneck on a busy web server.
The change makes each CharInfo object returned to the caller a mutable copy, so synchronization is no longer required.
Some Hashtables were changed to HashMap for performance.
Changes to isSpecialAttrChar() and isSpecialTextChar(). Basically these routines return true if there is an entity for the character. Previously isSpecialAttrChar() said that a lot of other characters were special, but now it is related only to entities.
However, there is some internal tweaking to:
> output a literal tab as "&#9;" in XML attribute values
> output a quote in an XML attribute as "&quot;"
> leave a literal quote as-is in HTML or XML text nodes
> output less than sign as-is in HTML attribute values
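The four tweaks listed above can be sketched as a pair of lookup routines. This is only an illustration, not the actual CharInfo code: the class EscapeRules, the Format enum and both method names are hypothetical, and the handling of '&', '>' and of quotes in HTML attributes is an assumption added to make the sketch complete.

```java
public class EscapeRules {
    /** Hypothetical output method selector. */
    public enum Format { XML, HTML }

    /** Returns the replacement for c inside an attribute value, or null to write c as-is. */
    public static String escapeInAttribute(char c, Format format) {
        switch (c) {
            case '\t':
                // A literal tab becomes a character reference in XML attribute values.
                return format == Format.XML ? "&#9;" : null;
            case '"':
                // Stated for XML attributes; applying it to HTML too is an assumption.
                return "&quot;";
            case '<':
                // A less-than sign is left as-is in HTML attribute values.
                return format == Format.HTML ? null : "&lt;";
            case '&':
                return "&amp;"; // assumption: simplified, ignores HTML special cases
            default:
                return null;
        }
    }

    /** Returns the replacement for c inside a text node, or null to write c as-is. */
    public static String escapeInText(char c) {
        switch (c) {
            case '<': return "&lt;";
            case '>': return "&gt;";  // assumption
            case '&': return "&amp;"; // assumption
            default:  return null;    // includes '"', which stays literal in text
        }
    }
}
```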
2. The ToStream method characters(final char chars[], final int start, final int length) is reworked in an efficient way so that characters in the C0 and C1 ranges are written out as character references (except for tab, newline and carriage return). The line separator 0x2028 is also written out as a character reference. This processing is done regardless of the XML version (1.0 or 1.1); it is useful for XML 1.0 output as well, in case that output is included as a general parsed entity in an XML 1.1 file.
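A rough sketch of the rule in item 2, not the real ToStream code. The class ControlCharEscaper and its methods are hypothetical, the exact bounds used for the C1 range (0x7F to 0x9F) are an assumption, and entity handling for '<' and '&' is left out to keep the example short.

```java
public class ControlCharEscaper {
    /** True if c should be written as a numeric character reference. */
    public static boolean needsCharRef(char c) {
        if (c == '\t' || c == '\n' || c == '\r') {
            return false; // the three exceptions stay literal
        }
        return c < 0x20                      // C0 controls
            || (c >= 0x7F && c <= 0x9F)      // assumed bounds for the C1 range
            || c == 0x2028;                  // line separator
    }

    /** Escapes chars[start .. start+length) into a new string. */
    public static String escape(char[] chars, int start, int length) {
        StringBuilder out = new StringBuilder(length);
        int end = start + length;
        for (int i = start; i < end; i++) {
            char c = chars[i];
            if (needsCharRef(c)) {
                out.append("&#").append((int) c).append(';');
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

With this sketch, the x8C character from the stylesheet above comes out as "&#140;" in both text and attribute content, provided both paths call the same routine.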
3. Changes to ToStream method writeAttrString()
4. Minor changes to ToXMLStream and ToHTMLStream to make the CharInfo object used to check for entities non-static, but one owned by that serializer, which drops the need for synchronization when looking up entities.
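The idea behind items 1 and 4 (one shared master table, plus an unsynchronized copy owned by each serializer) might look roughly like this. All names here (CharInfoSketch, Table, Serializer, defineEntity) are hypothetical and the entity list is a tiny stand-in for the real properties files. The sketch assumes each serializer instance is used by one thread at a time.

```java
import java.util.HashMap;
import java.util.Map;

public class CharInfoSketch {
    /** Per-caller entity table; no synchronized methods. */
    public static final class Table {
        // Shared master, filled once at class initialization and never mutated afterwards.
        private static final Map<Character, String> MASTER = new HashMap<>();
        static {
            MASTER.put('<', "&lt;");
            MASTER.put('>', "&gt;");
            MASTER.put('&', "&amp;");
        }

        private final Map<Character, String> entities;

        private Table() {
            // Each caller gets its own HashMap, so later changes stay local.
            this.entities = new HashMap<>(MASTER);
        }

        public static Table newCopy() {
            return new Table();
        }

        /** Returns the entity for c, or null if there is none. */
        public String getOutputStringForChar(char c) {
            return entities.get(c);
        }

        public void define(char c, String entity) {
            entities.put(c, entity);
        }
    }

    /** A serializer that owns its table instead of sharing a static one. */
    public static final class Serializer {
        private final Table charInfo = Table.newCopy();

        public void defineEntity(char c, String entity) {
            charInfo.define(c, entity);
        }

        public String escape(String text) {
            StringBuilder out = new StringBuilder(text.length());
            for (int i = 0; i < text.length(); i++) {
                char c = text.charAt(i);
                String entity = charInfo.getOutputStringForChar(c);
                out.append(entity != null ? entity : String.valueOf(c));
            }
            return out.toString();
        }
    }
}
```

The trade-off is a little extra memory per serializer in exchange for lock-free lookups on the hot path.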