Description
StringEscapeUtils.escapeXml passes characters that are not valid in XML through unchanged:
scala> org.apache.commons.lang3.StringEscapeUtils.escapeXml("\u0004").codePointAt(0)
res4: Int = 4
I would expect an exception here: either from StringEscapeUtils (refusing to encode the character) or, preferably, from String.codePointAt, complaining that the string is empty because escapeXml dropped the character. \u0004 is not a valid character in XML 1.0, and there is no way to represent it in an XML document, not even as a character reference.
Wikipedia summarizes the characters that are not allowed in XML – even after escaping: http://en.wikipedia.org/wiki/Valid_characters_in_XML. The reason for disallowing them: XML is a text interchange format, and control characters are not text.
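To make the rule concrete, here is a minimal sketch of a check against the XML 1.0 Char production, using only the JDK. The class and method names (Xml10Chars, isValid, stripInvalid) are made up for illustration and are not part of Commons Lang; dropping the offending characters is only one possible policy, and throwing would be another.

```java
// Hypothetical helper, not part of Commons Lang.
final class Xml10Chars {
    private Xml10Chars() {}

    /** True if the code point matches the XML 1.0 Char production. */
    static boolean isValid(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns a copy of the input without code points that XML 1.0 cannot represent. */
    static String stripInvalid(String input) {
        StringBuilder out = new StringBuilder(input.length());
        // codePoints() pairs surrogates; unpaired surrogates fall in
        // 0xD800-0xDFFF and are filtered out by isValid.
        input.codePoints()
             .filter(Xml10Chars::isValid)
             .forEach(out::appendCodePoint);
        return out.toString();
    }
}
```

With a filter like this applied before escaping, the input "\u0004" would become an empty string, so the codePointAt(0) call from the report would fail instead of returning 4.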
If StringEscapeUtils.escapeXml allows invalid XML characters through, whether escaped or not, the resulting XML is not well-formed. Conforming XML parsers will refuse to read such files.
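The parser behaviour can be checked with the JDK's built-in DOM parser. The following is a rough sketch; the class name ParserCheck and the method parses are hypothetical, and the result reflects the default JDK parser rather than every XML implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration only.
final class ParserCheck {
    private ParserCheck() {}

    /** Returns true if the JDK's default DOM parser accepts the document. */
    static boolean parses(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException e) {
            // The parser reported a well-formedness error.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling ParserCheck.parses on a document that contains U+0004, either raw or written as the character reference &#4;, should return false, while a document with ordinary text content parses normally.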
Issue Links
- is related to LANG-963 Clarify behavior of StringEscapeUtils.escapeXml by renaming it to StringEscapeUtils.escapeXmlEntities (Closed)