XalanJ2 / XALANJ-2547

Transformer creates duplicate carriage return in CDATA section on Windows

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 2.7.1
    • Fix Version/s: None
    • Component/s: transformation
    • Security Level: No security risk; visible to anyone (Ordinary problems in Xalan projects. Anybody can view the issue.)
    • Labels:
    • Environment:
      Windows 7 (64-bit)

      Description

      When a CDATA section contains a Windows line break (CR+LF), the transformer writes it out as CR+CR+LF, which most editors display as an extra blank line. Test code:

      import static org.junit.Assert.assertEquals;

      import java.io.StringWriter;

      import javax.xml.parsers.DocumentBuilderFactory;
      import javax.xml.parsers.ParserConfigurationException;
      import javax.xml.transform.OutputKeys;
      import javax.xml.transform.Transformer;
      import javax.xml.transform.TransformerException;
      import javax.xml.transform.TransformerFactory;
      import javax.xml.transform.dom.DOMSource;
      import javax.xml.transform.stream.StreamResult;

      import org.junit.Test;
      import org.w3c.dom.CDATASection;
      import org.w3c.dom.Document;
      import org.w3c.dom.Element;

      public class XmlCdataWithCrTest {

          // "\r\n" on Windows, where the problem shows up.
          private static final String LINE_SEPARATOR = System.getProperty("line.separator");

          @Test
          public void testXmlCdataWithCr() throws TransformerException, ParserConfigurationException {
              // Build <root><![CDATA[a<line separator>b]]></root> in memory.
              Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
              Element root = doc.createElement("root");
              doc.appendChild(root);
              CDATASection cdataSection = doc.createCDATASection("a" + LINE_SEPARATOR + "b");
              root.appendChild(cdataSection);

              // Serialize the DOM with an identity transformer.
              StringWriter writer = new StringWriter();
              Transformer transformer = TransformerFactory.newInstance().newTransformer();
              transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
              transformer.setOutputProperty(OutputKeys.INDENT, "yes");
              transformer.transform(new DOMSource(doc), new StreamResult(writer));

              String result = writer.toString(); // .replace("\r\r", "\r");
              String expected = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><root><![CDATA[a" + LINE_SEPARATOR + "b]]></root>"
                      + LINE_SEPARATOR;
              assertEquals(expected, result);
          }
      }
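The CR+CR+LF sequence is what you would get from a line-break translation that expands every LF to CR+LF without checking whether a CR already precedes it. This report does not confirm that the Xalan serializer works this way. The sketch below only reproduces the character sequence under that assumption and shows a post-processing workaround in the spirit of the commented-out `replace` in the test. The class and method names are hypothetical and are not part of Xalan.

```java
// Hypothetical illustration only; this is not Xalan code.
final class CrDuplicationSketch {

    // Assumed mechanism: every LF is expanded to CR+LF, even when the LF
    // already follows a CR, so an existing CR+LF becomes CR+CR+LF.
    static String naiveToWindows(String text) {
        return text.replace("\n", "\r\n");
    }

    // Workaround sketch: collapse CR+CR+LF back to CR+LF in serialized output.
    static String collapseDuplicateCr(String serialized) {
        return serialized.replace("\r\r\n", "\r\n");
    }
}
```

Matching on the full CR+CR+LF sequence is slightly more conservative than replacing every CR+CR pair, because it leaves consecutive bare carriage returns elsewhere in the output untouched.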

        Activity

        Daniel Schwering created issue -
        Ian Beaumont added a comment -

        My understanding is that CDATA content should not be changed. According to
        http://www.w3.org/TR/REC-xml/#sec-line-ends
        only "parsed entities" should have their line breaks changed, and don't CDATA sections contain "unparsed character data"?
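
For reference, the linked section of the XML 1.0 recommendation describes normalization on input: before parsing, CR+LF and any CR not followed by LF are each translated to a single LF. A minimal sketch of that rule follows, with hypothetical class and method names. It models how a parser reads line ends, not what a serializer is required to write.

```java
// Hypothetical helper sketching XML 1.0 line-end handling on input.
final class XmlLineEndSketch {

    // Translate CR+LF to LF first, then any remaining lone CR to LF.
    static String normalizeLineEnds(String input) {
        return input.replace("\r\n", "\n").replace('\r', '\n');
    }
}
```

Under this rule a stored CR+CR+LF is read back as two line feeds, which matches the extra blank line described in the report.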


          People

          • Assignee:
            Unassigned
            Reporter:
            Daniel Schwering
          • Votes:
            1
            Watchers:
            1
