Details
Improvement
Status: Open
Major
Resolution: Unresolved
2.1.12, 2.3.0
None
None
Patch available
Description
The XHTMLSerializer, or, more specifically, the XHTMLEncoder, from the serializers block in Cocoon 2.1.x replaces every character that has a corresponding HTML 4.0 character entity reference with that entity reference. This causes issues with inline JavaScript: for example, double quotes are transformed to &quot;, which causes a JavaScript parsing error. A minor additional drawback is the increased document size.
If I understand the W3C correctly (see e.g. [2]), the recommended approach is to use the characters of the document encoding wherever possible and to use escapes only in exceptional circumstances. I didn't find a reason why the XHTMLSerializer uses escapes, but I suspect that it is related to browser compatibility issues.
Maybe we could make this behaviour configurable, e.g.
<use-entity-references>true|false</use-entity-references>
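To illustrate what such a switch could change, here is a minimal Java sketch. It is not the actual XHTMLEncoder code: the class name EntityEscapeSketch, the encode method, the useEntityReferences flag and the three-entry entity table are all assumptions made for this example. With the flag on, characters that have a named entity are replaced, as described above; with the flag off, only the XML markup characters are escaped.

```java
import java.util.Map;

// Hypothetical sketch, not the Cocoon implementation.
public class EntityEscapeSketch {

    // Tiny illustrative subset of the HTML 4.0 named entities (assumption:
    // the real encoder uses the full table).
    private static final Map<Character, String> NAMED = Map.of(
            '"', "&quot;",
            '\u00e9', "&eacute;",
            '\u00a0', "&nbsp;");

    // Encodes character data. Markup characters are always escaped;
    // other characters are replaced by a named entity only when
    // useEntityReferences is true.
    public static String encode(String text, boolean useEntityReferences) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '&': out.append("&amp;"); break;
                default:
                    String entity = useEntityReferences ? NAMED.get(c) : null;
                    if (entity != null) {
                        out.append(entity);
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String script = "alert(\"hi\");";
        System.out.println(encode(script, true));   // alert(&quot;hi&quot;);
        System.out.println(encode(script, false));  // alert("hi");
    }
}
```

With the flag set to false the double quotes in the inline script pass through unchanged, which is the behaviour the proposed option is meant to allow. Characters outside the output encoding would still need numeric character references, which this sketch does not handle.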
[1] http://www.nabble.com/Problem-with-XHTMLSerializers-to1311360.html#a1311360
[2] http://www.w3.org/International/tutorials/tutorial-char-enc/