Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Fixed
-
2.1.12
-
None
Description
While investigating an issue in the Sling project with support for emoji characters, I noticed that the XMLEncoder used by HTMLSerializer doesn't support Unicode surrogate pairs, which Java strings use to represent supplementary characters (code points above U+FFFF).
A simple unit test that demonstrates this issue is here:
https://github.com/micronode/whistlepost/blob/master/whistlepost-rewrite-lib/src/test/groovy/org/apache/cocoon/components/serializers/encoding/XMLEncoderTest.groovy
More background info here also: SLING-5973
This seems to have been identified/addressed in other Apache projects also:
https://issues.apache.org/jira/browse/THRIFT-3403?jql=text%20~%20%22surrogate%20pairs%22
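For illustration, here is a minimal sketch of surrogate-aware escaping. It is not the Cocoon XMLEncoder code or the fix that was applied; the class and method names (`SurrogateAwareEscaper`, `escapeNonAscii`) are hypothetical. It assumes the desired output is one numeric character reference per code point, and it deliberately ignores markup characters such as `<` and `&`. It relies on `String.codePointAt` and `Character.charCount`, both available since Java 5.

```java
// Hypothetical helper for illustration only; not part of Cocoon's API.
final class SurrogateAwareEscaper {

    /**
     * Replaces every non-ASCII code point with a hexadecimal numeric
     * character reference. A surrogate pair is read as ONE code point,
     * so it yields one reference instead of two invalid ones.
     * Note: an unpaired surrogate is passed through as its own value,
     * which a real encoder would need to reject or replace.
     */
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            // Combines a high/low surrogate pair into a single code point.
            int codePoint = text.codePointAt(i);
            if (codePoint < 0x80) {
                out.append((char) codePoint);
            } else {
                out.append("&#x")
                   .append(Integer.toHexString(codePoint).toUpperCase())
                   .append(';');
            }
            // Advance by 2 chars for a supplementary code point, else 1.
            i += Character.charCount(codePoint);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+1F600 is stored in a Java String as the pair D83D DE00.
        String emoji = "\uD83D\uDE00";
        System.out.println(escapeNonAscii("hi " + emoji)); // hi &#x1F600;
    }
}
```

An encoder that walks the string one `char` at a time would instead see the two surrogates separately, and neither half is a valid XML character on its own.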
Attachments
Issue Links
- is related to
  - SLING-5973 HTMLSerializer not handling some unicode characters (emoji, etc.) (Open)
- requires
  - COCOON-2356 Set source and target compatibility to JDK 1.5 (Closed)
- links to