  1. Xerces2-J
  2. XERCESJ-1274

Data corruption in sax parsed attributes after data contains < or >

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Invalid
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: SAX
    • Labels: None
    • Environment: JDK 1.6_03, Windows Vista

    Description

      This report concerns the Xerces that is bundled into the Java 1.6_03 release. I have also filed a bug with Sun, but I do not know its number yet.

      When XML attribute data contains < or >, then after a certain number of attributes, data from one attribute is written into another, corrupting it. The simple sample program below SAX-parses an XML string and prints the name and value of each attribute.

      The string is

      <?xml version="1.0" encoding="UTF-8"?>
      <Test test1="11111 <" test2="22222 <" test3="33333 <" test4="44444 <" test5="55555 <" test6="66666 <" test7="77777 <" test8="88888 <" test9="99999 <" test10="101010101010 <" />

      but the Xerces in Java 1.6_03 parses it as if it were:

      <?xml version="1.0" encoding="UTF-8"?>
      <Test test1="11111 <" test2="22222 <" test3="33333 <" test4="44444 <" test5="55555 <" test6="66666 <" test7="88888 <" test8="88888 <" test9="1010101" test10="101010101010 <" />

      Notice test7 and test9 have been corrupted.

      Test program:

      import java.io.*;
      import javax.xml.parsers.*;
      import org.xml.sax.*;
      import org.xml.sax.helpers.*;

      public class Test
      {
          private static class Handler extends DefaultHandler
          {
              public void startElement(String uri, String localName, String name, Attributes attributes) throws SAXException
              {
                  super.startElement(uri, localName, name, attributes);
                  for (int i = 0; i < attributes.getLength(); i++)
                  {
                      System.out.println(attributes.getLocalName(i) + " = " + attributes.getValue(i));
                  }
              }
          }

          public static String xmlString = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><Test test1=\"11111 <\" test2=\"22222 <\" test3=\"33333 <\" test4=\"44444 <\" test5=\"55555 <\" test6=\"66666 <\" test7=\"77777 <\" test8=\"88888 <\" test9=\"99999 <\" test10=\"101010101010 <\" />";

          public static void main(String[] args) throws Exception
          {
              SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
              parser.parse(new InputSource(new StringReader(xmlString)), new Handler());
          }
      }
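
The report does not record why the issue was resolved as Invalid. One plausible reading is that the input is not well-formed: the XML 1.0 grammar does not allow a literal < inside an attribute value (a literal > is permitted), so a parser's behaviour on such input is not a reliable guide. The following is a minimal sketch, assuming the intended values are the ones listed in the description, that escapes each < as &lt; before parsing. The class name EscapedAttributes and its helper methods are hypothetical and are not part of the original report.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class EscapedAttributes {
    // Attribute values from the report, in their intended (unescaped) form.
    static final String[] VALUES = {
        "11111 <", "22222 <", "33333 <", "44444 <", "55555 <",
        "66666 <", "77777 <", "88888 <", "99999 <", "101010101010 <"
    };

    // Escape characters that may not appear literally in a double-quoted attribute value.
    // '&' is replaced first so the entities added afterwards are not escaped again.
    static String escapeAttribute(String value) {
        return value.replace("&", "&amp;").replace("<", "&lt;").replace("\"", "&quot;");
    }

    // Build a well-formed variant of the report's document: test1..test10 on one element.
    static String buildXml() {
        StringBuilder sb = new StringBuilder("<?xml version=\"1.0\" encoding=\"UTF-8\"?><Test");
        for (int i = 0; i < VALUES.length; i++) {
            sb.append(" test").append(i + 1)
              .append("=\"").append(escapeAttribute(VALUES[i])).append('"');
        }
        return sb.append(" />").toString();
    }

    // Parse the document with the default JAXP SAX parser and collect attributes in order.
    static Map<String, String> parseAttributes(String xml) throws Exception {
        final Map<String, String> result = new LinkedHashMap<String, String>();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                for (int i = 0; i < attributes.getLength(); i++) {
                    // getQName is used because the default factory is not namespace-aware,
                    // in which case the local name may be empty.
                    result.put(attributes.getQName(i), attributes.getValue(i));
                }
            }
        });
        return result;
    }

    public static void main(String[] args) throws Exception {
        for (Map.Entry<String, String> e : parseAttributes(buildXml()).entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

With the escaped input, the parser expands each &lt; back to <, so the handler still sees the intended values. Whether this avoids the behaviour described above on the bundled JDK 1.6_03 parser is not confirmed by the report.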

          People

            Assignee: Unassigned
            Reporter: David Frankson (dfrankson)
