Xerces-C++ / XERCESC-1336

Xerces C++ defines an encoding-string that Xerces/Java refuses to parse

Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Won't Fix
    • Affects Version/s: 2.6.0
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    • Environment: Solaris 6, XercesJ 2.4

    Description

      We use Xerces-C++ to create XML messages that are later parsed by Xerces/Java.

      When we set the encoding with the constant XMLUni::fgISO88591EncodingString, the XML message declares the encoding "ISO8859-1", because the string is defined as
      "chLatin_I, chLatin_S, chLatin_O, chDigit_8, chDigit_8, chDigit_5, chDigit_9, chDash, chDigit_1, chNull".

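      To see which string that definition spells, the sketch below rebuilds it with plain char stand-ins. The constant names are copied from the definition quoted above; their char types and the names isoLatin1Name and spelledName are assumptions made so the sketch compiles without the library (the real constants are XMLCh values).

```cpp
#include <string>

// Stand-ins for the Xerces-C++ character constants quoted above.
// Assumption: plain char is used here instead of XMLCh so that the
// sketch builds without the library.
const char chLatin_I = 'I';
const char chLatin_S = 'S';
const char chLatin_O = 'O';
const char chDigit_1 = '1';
const char chDigit_5 = '5';
const char chDigit_8 = '8';
const char chDigit_9 = '9';
const char chDash    = '-';
const char chNull    = '\0';

// Same element order as the definition quoted in the report.
const char isoLatin1Name[] = {
    chLatin_I, chLatin_S, chLatin_O, chDigit_8, chDigit_8,
    chDigit_5, chDigit_9, chDash, chDigit_1, chNull
};

// Hypothetical helper: returns the name as a std::string for comparison.
std::string spelledName() {
    return std::string(isoLatin1Name);
}
```

      The only dash in the array sits between the 9 and the 1, so the result has no dash after "ISO".
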
      When we later parse this message with Xerces/Java, we get the following error:

      [Fatal Error] :1:43: Invalid encoding name "ISO8859-1".

      It seems that Xerces/Java only recognizes the encoding name "ISO-8859-1" (with a dash after "ISO"), but not "ISO8859-1" (without that dash).

      The XML specification states that "ISO-8859-1" (with a dash) SHOULD be used; see http://www.w3.org/TR/2004/REC-xml-20040204/#charencoding

      In addition, the file src/xercesc/util/XMLUni.cpp defines further variants of the encoding name, and we are not sure which of them Xerces/Java supports.

      So in my opinion, either Xerces-C++ should stop providing that constant, or Xerces/Java should be enhanced to accept that encoding string.
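
      Until one of those changes happens, a producer could rewrite the name before it emits the XML declaration. The following is only a minimal sketch of that idea: withIsoDash and xmlDeclaration are hypothetical helpers, not part of Xerces-C++, and they assume the encoding name is available as a std::string.

```cpp
#include <string>

// Hypothetical helper, not part of Xerces-C++: rewrites "ISO8859-n"
// as "ISO-8859-n" and returns every other name unchanged.
// The match is case-sensitive, which is a simplification.
std::string withIsoDash(const std::string& name) {
    const std::string prefix = "ISO8859-";
    if (name.compare(0, prefix.size(), prefix) == 0) {
        return "ISO-8859-" + name.substr(prefix.size());
    }
    return name;
}

// Hypothetical helper: builds an XML declaration that carries the
// rewritten encoding name.
std::string xmlDeclaration(const std::string& encoding) {
    return "<?xml version=\"1.0\" encoding=\"" + withIsoDash(encoding) + "\"?>";
}
```

      With this in place, xmlDeclaration("ISO8859-1") produces a declaration with encoding="ISO-8859-1", the spelling the XML specification recommends, while names such as "UTF-8" pass through untouched.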

          People

            Assignee: Unassigned
            Reporter: Dominik Stadler (dominik.stadler@gmx.at)