XalanJ2 / XALANJ-2595

Non-BMP characters mangled during XSLT transform

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: transformation
    • Security Level: No security risk; visible to anyone (Ordinary problems in Xalan projects. Anybody can view the issue.)
    • Labels: None

    Description

      This XML:
      <root>
      <title>𐍀𐌰𐍂𐌹𐍃</title>
      </root>
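
      For context, the title appears to consist of five Gothic letters (U+10340, U+10330, U+10342, U+10339, U+10343), all outside the Basic Multilingual Plane. The sketch below is not part of the original report; it assumes those code points and shows that Java stores each one as a surrogate pair. The class and method names are illustrative only.

```java
// Illustrative sketch: class and method names are hypothetical.
final class SurrogateDemo {
    // Assumed content of <title>: five Gothic letters, built from code points
    // so that the source file encoding cannot affect the string.
    static final String TITLE =
        new String(new int[] {0x10340, 0x10330, 0x10342, 0x10339, 0x10343}, 0, 5);

    // Number of UTF-16 code units (Java chars) in the string.
    static int utf16Units(String s) {
        return s.length();
    }

    // Number of Unicode code points in the string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        System.out.println("UTF-16 units: " + utf16Units(TITLE)); // 10
        System.out.println("code points:  " + codePoints(TITLE)); // 5
        // Every char in the string is half of a surrogate pair.
        for (char c : TITLE.toCharArray()) {
            System.out.printf("%04X surrogate=%b%n", (int) c, Character.isSurrogate(c));
        }
    }
}
```

      Five code points occupy ten UTF-16 units. A serializer that handled each unit on its own, rather than as half of a pair, would emit ten bad characters per title, but the report itself does not identify the cause.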

      Run through this stylesheet:
      <?xml version="1.0" encoding="utf-8"?>
      <xsl:stylesheet
      version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns:norm="ORG.oclc.util.NormalFormC"
      xmlns:v="http://viaf.org/viaf/terms#"
      exclude-result-prefixes="xsl norm v">

      <xsl:output
      encoding="UTF-8"
      method="html"
      version="4.0"/>

      <xsl:template match="root">
      <html>
      <head><title><xsl:value-of select="title"/></title></head>
      <body><h1><xsl:value-of select="title"/></h1></body>
      </html>
      </xsl:template>
      </xsl:stylesheet>

      Produces this bogus output:
      <?xml version="1.0" encoding="UTF-8"?>
      <html>
      <head><title>����������</title>
      </head>
      <body><h1>����������</h1>
      </body>
      </html>
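
      A minimal harness for re-running the case through JAXP might look like the sketch below. It is not from the original report. It assumes the five Gothic code points above, trims the stylesheet to the parts that matter (the unused norm and v namespaces are dropped), and uses whatever TransformerFactory JAXP selects. To exercise Xalan-J itself, its jars must be on the classpath; otherwise the JDK's built-in processor runs and may behave differently. The class name NonBmpRepro and its members are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Illustrative reproduction harness: names are hypothetical.
final class NonBmpRepro {
    // Assumed title text: five non-BMP Gothic letters.
    static final String TITLE =
        new String(new int[] {0x10340, 0x10330, 0x10342, 0x10339, 0x10343}, 0, 5);

    static final String XML = "<root><title>" + TITLE + "</title></root>";

    // Reduced form of the stylesheet from the report.
    static final String XSL =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output encoding=\"UTF-8\" method=\"html\" version=\"4.0\"/>"
        + "<xsl:template match=\"root\">"
        + "<html><head><title><xsl:value-of select=\"title\"/></title></head>"
        + "<body><h1><xsl:value-of select=\"title\"/></h1></body></html>"
        + "</xsl:template></xsl:stylesheet>";

    // Runs the transform into a byte stream and decodes the bytes as UTF-8,
    // so that the serializer's own encoding step is part of the test.
    static String transform(String xml, String xsl) throws Exception {
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer =
            factory.newTransformer(new StreamSource(new StringReader(xsl)));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        String html = transform(XML, XSL);
        System.out.println(html);
        // false means the title did not survive as literal UTF-8 text.
        System.out.println("title preserved: " + html.contains(TITLE));
    }
}
```

      The check only tests for the literal characters. A processor that writes the title as numeric character references would also print false, so the raw output should be inspected as well.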

          People

            Assignee: Steven J. Hathaway (shathaway)
            Reporter: Ralph LeVan (ralphlevan)
            Votes: 0
            Watchers: 1
