  1. Xerces-C++
  2. XERCESC-959

Surrogate characters mishandled by SAXPrint and SAX2Print.

Details

    • Bug
    • Status: Closed
    • Resolution: Fixed
    • Nightly build (please specify the date)
    • None
    • Samples/Tests
    • None
    • Operating System: All
      Platform: All
    • 21780

    Description

      From local CYGWIN build from CVS head (July 21, 2003):

      The SAXPrint and SAX2Print samples write supplementary characters (those above
      U+FFFF) as separate character references for their high and low surrogates. The
      problem appears to be in framework/XMLFormatter, since I don't see any code there
      that checks for surrogates. If that is where the problem is, I would guess that
      DOMWriter exhibits the same behaviour.

      Here's an example...

      Input to SAXPrint:

      <?xml version="1.0" encoding="UTF-8"?>
      <root>𐀀􏿿</root>

      Output from SAXPrint:

      <?xml version="1.0" encoding="LATIN1"?>
      <root>&#xD800;&#xDC00;&#xDBFF;&#xDFFF;</root>

      The surrogate code points (#xD800-#xDFFF) are excluded from the Char production,
      so those character references are illegal.
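
For illustration, the expected behaviour can be sketched as follows. This is a minimal standalone sketch, not the actual XMLFormatter code or the fix that was applied: the function name escapeNonAscii is hypothetical, it works on std::u16string rather than XMLCh buffers, and replacing an unpaired surrogate with U+FFFD is an assumption made here (a real serializer might report an error instead). The point is only that a high/low surrogate pair is combined into one code point before the character reference is written.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper, not part of Xerces-C++: copy ASCII through unchanged
// and write every other character as one hexadecimal character reference.
std::string escapeNonAscii(const std::u16string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        const bool isHigh = cp >= 0xD800 && cp <= 0xDBFF;
        const bool isLow  = cp >= 0xDC00 && cp <= 0xDFFF;

        if (isHigh && i + 1 < in.size()
            && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // Combine the surrogate pair into a single supplementary code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;  // the low surrogate has been consumed
        } else if (isHigh || isLow) {
            // Assumption: an unpaired surrogate is replaced with U+FFFD,
            // because a reference to a surrogate code point is not legal XML.
            cp = 0xFFFD;
        }

        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else {
            char buf[16];
            std::snprintf(buf, sizeof buf, "&#x%X;", static_cast<unsigned>(cp));
            out += buf;
        }
    }
    return out;
}
```

With the input from this report, escapeNonAscii(u"<root>\U00010000\U0010FFFF</root>") yields <root>&#x10000;&#x10FFFF;</root>, which is well-formed because both references name code points inside Char.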

          People

            Assignee: Unassigned
            Reporter: Michael Glavassevich (mrglavas)