Cocoon / COCOON-2352

XMLEncoder doesn't support Unicode surrogate pairs

Details

    Description

      While investigating an emoji-support issue in the Sling project, I noticed that the XMLEncoder used by HTMLSerializer does not handle Unicode surrogate pairs, which are how UTF-16 represents characters above U+FFFF.

      A simple unit test that demonstrates this issue is here:

      https://github.com/micronode/whistlepost/blob/master/whistlepost-rewrite-lib/src/test/groovy/org/apache/cocoon/components/serializers/encoding/XMLEncoderTest.groovy
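
For readers who cannot open the linked test, here is a minimal sketch of the behaviour one would expect from a surrogate-aware encoder. It is not Cocoon's XMLEncoder code: the class `SurrogatePairSketch` and its `encode` method are hypothetical names chosen for illustration, and the sketch assumes that non-ASCII characters should be written as hexadecimal numeric character references. It only covers well-formed input and does not escape markup characters such as `<` or `&`.

```java
public class SurrogatePairSketch {

    // Hypothetical helper, not part of Cocoon: writes every non-ASCII
    // code point as a hexadecimal numeric character reference.
    static String encode(String input) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            // codePointAt merges a valid high/low surrogate pair into one code point.
            int codePoint = input.codePointAt(i);
            if (codePoint < 0x80) {
                out.append((char) codePoint);
            } else {
                out.append("&#x")
                   .append(Integer.toHexString(codePoint).toUpperCase())
                   .append(';');
            }
            // Advance by two chars for a supplementary code point, otherwise by one.
            i += Character.charCount(codePoint);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+1F600 (grinning face) written as its UTF-16 surrogate pair.
        String grinningFace = "\uD83D\uDE00";
        System.out.println(encode("a" + grinningFace)); // prints a&#x1F600;
    }
}
```

An encoder that escaped each UTF-16 code unit on its own would instead produce two references, `&#xD83D;` and `&#xDE00;`. Surrogate values are not legal XML characters, so a parser would reject that output.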

      More background info here also: SLING-5973

      This seems to have been identified and addressed in other Apache projects as well:

      https://issues.apache.org/jira/browse/THRIFT-3403?jql=text%20~%20%22surrogate%20pairs%22

            People

              ilgrosso Francesco Chicchiriccò
              fortuna Ben Fortuna
