Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Description
The HTML specification disallows certain code points from appearing in HTML files (XML has essentially the same list, minus the high ISO characters), as specified in http://www.w3.org/TR/REC-html40/sgml/sgmldecl.html. The Trinidad HtmlEscapes utilities nevertheless write the low control characters, and the high characters that are technically outside of Unicode, to the output unchanged. This causes failures when the content is validated.
The fix is to use numeric character references such as &#1; rather than outputting code point 1 directly. In addition, Internet Explorer has a bug where one of these references is rendered incorrectly, so it is preferable to suppress that character rather than output it.
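The approach described above can be sketched roughly as follows. This is only an illustration, not the actual Trinidad change: the class name ControlCharEscaper, both method names, and the "suppressed" parameter are hypothetical, and the disallowed ranges are my reading of the UNUSED entries in the linked SGML declaration (surrogates are left out for brevity). Which character should be suppressed for Internet Explorer is left to the caller, since the report does not preserve it.

```java
import java.util.Set;

// Hypothetical helper; not part of Trinidad's HtmlEscapes.
final class ControlCharEscaper {

    // Assumed from the HTML 4 SGML declaration: 0-8, 11-12, 14-31 and
    // 127-159 are marked UNUSED (tab, LF and CR remain allowed).
    static boolean isDisallowed(int cp) {
        return (cp >= 0 && cp <= 8)
            || cp == 11 || cp == 12
            || (cp >= 14 && cp <= 31)
            || (cp >= 127 && cp <= 159);
    }

    // Writes disallowed code points as numeric character references and
    // drops any code point listed in 'suppressed' (caller-supplied).
    static String escape(String text, Set<Integer> suppressed) {
        StringBuilder out = new StringBuilder(text.length());
        text.codePoints().forEach(cp -> {
            if (suppressed.contains(cp)) {
                return; // suppress entirely instead of emitting a reference
            }
            if (isDisallowed(cp)) {
                out.append("&#").append(cp).append(';');
            } else {
                out.appendCodePoint(cp);
            }
        });
        return out.toString();
    }
}
```

For example, ControlCharEscaper.escape("a\u0001b", Set.of()) yields "a&#1;b", while passing Set.of(1) as the second argument drops the character and yields "ab". Markup-significant characters such as < and & are not handled here; a real escaper would deal with those as well.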