Sling / SLING-5973

HTMLSerializer not handling some unicode characters (emoji, etc.)

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Extensions
    • Labels: None

    Description

      When my Sling content contains Unicode special characters (e.g. emoji) and the Sling rewriter is enabled, the characters are not output correctly to the browser. For example:

      😁

      becomes

      ��

      If I disable the rewriter pipeline the output is as expected.

      I've looked at the code and I suspect the issue is in the HTMLSerializer from the Cocoon library. I'm not sure why, though, as it should be using the default output encoding (which is UTF-8). My rewriter pipeline uses the default html-generator and html-serializer provided by Sling.
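
      One possible mechanism, offered here only as an assumption and not confirmed against the Sling or Cocoon sources, is a serializer that encodes output one UTF-16 code unit at a time. An emoji such as U+1F601 is stored in Java as a surrogate pair, and each half encoded on its own is invalid, so it gets replaced. The sketch below illustrates that failure mode with plain JDK calls; the class and method names (SurrogateDemo, encodeWhole, encodePerChar) are hypothetical and do not come from the rewriter code.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows how per-char encoding breaks surrogate pairs.
class SurrogateDemo {

    /** Encodes the whole string at once, so surrogate pairs stay together. */
    static byte[] encodeWhole(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /**
     * Encodes one UTF-16 code unit at a time and decodes the bytes again.
     * A lone surrogate cannot be encoded, so String.getBytes substitutes
     * the charset's replacement byte for each half of the pair.
     */
    static String encodePerChar(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            byte[] bytes = String.valueOf(s.charAt(i)).getBytes(StandardCharsets.UTF_8);
            out.append(new String(bytes, StandardCharsets.UTF_8));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String emoji = "\uD83D\uDE01"; // U+1F601, the emoji from the report

        // Encoded as a whole, the emoji is a single 4-byte UTF-8 sequence.
        System.out.println(encodeWhole(emoji).length);

        // Encoded per code unit, one emoji turns into two replacement characters.
        System.out.println(encodePerChar(emoji));

        // Characters in the Basic Multilingual Plane survive per-char encoding.
        System.out.println(encodePerChar("caf\u00E9"));
    }
}
```

      If the serializer behaves this way, it would explain why ordinary accented characters render correctly while each emoji shows up as two broken characters, even though the output encoding is UTF-8.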

      My code is available on GitHub here:

      https://github.com/Whistlepost/emojistrip

      It provides a very simple app/content project pair with some emoji characters in the content (see src/main/resources/SLING-INF/content/phrases.json). Many thanks.

      Attachments

        1. emoji-no-sling-rewriter.png
          36 kB
          Ben Fortuna
        2. emoji-with-sling-rewriter.png
          26 kB
          Ben Fortuna

            People

              Assignee: Unassigned
              Reporter: Ben Fortuna (fortuna)
              Votes: 0
              Watchers: 3
