Details
Type: Bug
Status: Resolved
Priority: Critical
Resolution: Invalid
Environment: JDK 1.6_03, Windows Vista
Description
This is for the Xerces that is bundled with the Java 1.6_03 release. I have also filed a bug with Sun, but I don't know the number yet.
When XML attribute data contains an escaped < or > (&lt; or &gt;), then after a certain number of attributes, data from one attribute gets written into another, corrupting the data. Below is a simple sample program that SAX-parses an XML string and prints the value of each attribute.
The string is
<?xml version="1.0" encoding="UTF-8"?>
<Test test1="11111 &lt;" test2="22222 &lt;" test3="33333 &lt;" test4="44444 &lt;" test5="55555 &lt;" test6="66666 &lt;" test7="77777 &lt;" test8="88888 &lt;" test9="99999 &lt;" test10="101010101010 &lt;" />
but the Xerces in Java 1.6_03 parses it as if it were:
<?xml version="1.0" encoding="UTF-8"?>
<Test test1="11111 &lt;" test2="22222 &lt;" test3="33333 &lt;" test4="44444 &lt;" test5="55555 &lt;" test6="66666 &lt;" test7="88888 &lt;" test8="88888 &lt;" test9="1010101" test10="101010101010 &lt;" />
Notice that test7 and test9 have been corrupted: test7 now holds the value of test8, and test9 holds a fragment of the value of test10.
Test program:
import java.io.*;
import javax.xml.parsers.*;
import org.xml.sax.*;
import org.xml.sax.helpers.*;

public class Test
{
    // Prints every attribute of each element exactly as the parser reports it.
    private static class Handler extends DefaultHandler
    {
        public void startElement(String uri, String localName, String name, Attributes attributes) throws SAXException
        {
            super.startElement(uri, localName, name, attributes);
            for (int i = 0; i < attributes.getLength(); i++)
            {
                System.out.println(attributes.getQName(i) + " = " + attributes.getValue(i));
            }
        }
    }

    // A literal '<' is not allowed inside an attribute value, so it is escaped as &lt;.
    public static String xmlString = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><Test test1=\"11111 &lt;\" test2=\"22222 &lt;\" test3=\"33333 &lt;\" test4=\"44444 &lt;\" test5=\"55555 &lt;\" test6=\"66666 &lt;\" test7=\"77777 &lt;\" test8=\"88888 &lt;\" test9=\"99999 &lt;\" test10=\"101010101010 &lt;\" />";

    public static void main(String[] args) throws Exception
    {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xmlString)), new Handler());
    }
}
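To make the corruption easier to spot than by reading the printed output, the test could compare each parsed value with the value that was written into the document. The following is only a sketch of that idea, not part of the original report: the class name AttributeCheck and the helpers expectedValues, buildXml, parseAttributes and countMismatches are all made up for illustration. It assumes the default, non-namespace-aware SAXParserFactory, so getQName returns the plain attribute name.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical self-checking variant of the test program above.
public class AttributeCheck {

    // The attribute values from the report: test1..test9 hold five repeated digits, each followed by " <".
    public static Map<String, String> expectedValues() {
        Map<String, String> expected = new LinkedHashMap<String, String>();
        for (int i = 1; i <= 9; i++) {
            StringBuilder value = new StringBuilder();
            for (int j = 0; j < 5; j++) {
                value.append(i);
            }
            expected.put("test" + i, value + " <");
        }
        expected.put("test10", "101010101010 <");
        return expected;
    }

    // Serializes the map as attributes of a single <Test/> element, escaping &, < and ".
    public static String buildXml(Map<String, String> attributes) {
        StringBuilder xml = new StringBuilder("<?xml version=\"1.0\" encoding=\"UTF-8\"?><Test");
        for (Map.Entry<String, String> e : attributes.entrySet()) {
            String escaped = e.getValue().replace("&", "&amp;").replace("<", "&lt;").replace("\"", "&quot;");
            xml.append(' ').append(e.getKey()).append("=\"").append(escaped).append('"');
        }
        return xml.append(" />").toString();
    }

    // Parses the document with the platform's default SAX parser and collects every attribute seen.
    public static Map<String, String> parseAttributes(String xml) {
        final Map<String, String> actual = new LinkedHashMap<String, String>();
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName, Attributes attributes) {
                    for (int i = 0; i < attributes.getLength(); i++) {
                        actual.put(attributes.getQName(i), attributes.getValue(i));
                    }
                }
            });
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return actual;
    }

    // Prints each attribute whose parsed value differs from the written one and returns how many differ.
    public static int countMismatches(Map<String, String> expected, Map<String, String> actual) {
        int mismatches = 0;
        for (Map.Entry<String, String> e : expected.entrySet()) {
            String parsed = actual.get(e.getKey());
            if (!e.getValue().equals(parsed)) {
                System.out.println(e.getKey() + ": expected \"" + e.getValue() + "\" but got \"" + parsed + "\"");
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        Map<String, String> expected = expectedValues();
        Map<String, String> actual = parseAttributes(buildXml(expected));
        int mismatches = countMismatches(expected, actual);
        System.out.println(mismatches == 0 ? "All attributes intact" : mismatches + " corrupted attribute(s)");
    }
}
```

On a parser without the defect this should report that all attributes are intact. On the parser described in the report, one would expect it to list test7 and test9 as mismatches.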