  1. Tuscany
  2. TUSCANY-4075

SDO C++ handling of XML CDATA and escaped chars inconsistent

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: Cpp-M3
    • Fix Version/s: None
    • Component/s: C++ SDO
    • Labels: None

    Description

      We are using both C++ and Java SDO in a project and discovered some
      misbehavior in the C++ components when XML data is converted from/to
      SDO and the XML contains either escaped chars or CDATA. Java seems to
      get it mostly right (see below).

      Looking at the SDO (C++ M3) code and searching the web (e.g. [1]), it
      looks as if this topic is still a bit, well, incomplete in the C++
      world.

      The problem (C++):

      • loading an XML document with CDATA inside works nicely: the CDATA
        remains intact, so saving works nicely too. However, if I call
        DataObjectPtr->getCString(), the returned value still contains the
        CDATA markup, which means that as a user I have to deal with it :-|
      • loading an XML document with escaped chars (e.g. &lt;) works too,
        since libxml2 converts them. getCString() returns the real text
        (e.g. "<"), but saving does not re-insert the escaping, i.e. the
        resulting XML is not usable anymore (TUSCANY-1553)
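
To illustrate what "deal with that" means for the first point today, here is a minimal sketch of a user-side helper that strips CDATA markers from a value returned by getCString(). stripCdataMarkers is a hypothetical name, not part of the SDO API, and the sketch assumes the markers appear in the value literally as <![CDATA[ and ]]>.

```cpp
#include <string>

// Hypothetical helper, not part of the SDO API: removes CDATA section
// markers from a value and keeps the text between them unchanged.
// Assumption: the markers appear literally as "<![CDATA[" and "]]>".
std::string stripCdataMarkers(const std::string& raw) {
    static const std::string open = "<![CDATA[";
    static const std::string close = "]]>";
    std::string out;
    std::string::size_type pos = 0;
    while (true) {
        const std::string::size_type start = raw.find(open, pos);
        if (start == std::string::npos) {
            out.append(raw, pos, std::string::npos);    // no more sections
            break;
        }
        out.append(raw, pos, start - pos);              // text before the section
        const std::string::size_type body = start + open.size();
        const std::string::size_type end = raw.find(close, body);
        if (end == std::string::npos) {
            out.append(raw, start, std::string::npos);  // unterminated: keep as-is
            break;
        }
        out.append(raw, body, end - body);              // section content, verbatim
        pos = end + close.size();
    }
    return out;
}
```

A caller would wrap every read, e.g. stripCdataMarkers(obj->getCString("name")), where obj and the property name are placeholders. Every user of the API has to repeat this, which is the inconvenience described above.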

      In Java this looks much better and quite as I'd expect it to:

      • loading XML with either construct works
      • using getCString() just returns the real text with the escaped
        sections converted
      • saving works too. CDATA sections are lost; their content is written
        back as escaped XML instead. This is not the original XML anymore,
        but at least it is valid and logically the same as the input
      • Example:
        Input XML:
        <tns1:name>ü&lt;&gt;bla blub <![CDATA[ <<>> ]]></tns1:name>
        getCString() in Java:
        "ü<>bla blub <<>> "
        Saving this as XML:
        <tns1:name>ü&lt;&gt;bla blub &lt;&lt;&gt;&gt; </tns1:name>
        The only questionable thing is the saved "ü": should it be written
        back in an escaped form and, if so, which one?
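
The save step in this example can be approximated as follows. This is only a sketch of the escaping involved, not what the Java implementation actually runs; escapeXmlText is a hypothetical name.

```cpp
#include <string>

// Hypothetical stand-in for the escaping a serializer applies to element
// text: the three characters that are special in text content are replaced
// by their predefined entities; everything else is copied byte for byte.
std::string escapeXmlText(const std::string& text) {
    std::string out;
    out.reserve(text.size());
    for (const char c : text) {
        switch (c) {
            case '&': out += "&amp;"; break;
            case '<': out += "&lt;";  break;
            case '>': out += "&gt;";  break;
            default:  out += c;       break;  // includes UTF-8 bytes such as "ü"
        }
    }
    return out;
}
```

Applied to "ü<>bla blub <<>> " this yields "ü&lt;&gt;bla blub &lt;&lt;&gt;&gt; ". The "ü" passes through unchanged, which is exactly the open question: the escaping alone does not remember how a character was written in the input.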

      Anyway, now the question: as it seems there were discussions going on
      when SDO C++ was implemented, has the approach above (as in Java)
      ever been considered and, if so, why was it not followed?
      I believe it would also have been much simpler than what exists today:

      • while parsing:
          • the CDATA handler function of the SAX2 handler just appends
            the text returned by libxml2
          • escaped chars are converted by libxml2 anyway
      • the property value now contains the real text
        (e.g. "ü<>bla blub <<>> ") and returns it just as-is in getCString()
      • setting that property also just sets the passed-in value
      • saving the property just calls libxml2 xmlTextWriterWriteString()
        which should escape the special chars
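
The parse and get/set side of this proposal could look roughly like the sketch below. It does not link against libxml2; the two callbacks only mirror the shape of libxml2's charactersSAXFunc and cdataBlockSAXFunc (xmlChar is unsigned char), and TextProperty, onCharacters and onCdataBlock are hypothetical names that do not correspond to the actual SDO C++ classes.

```cpp
#include <string>

// Hypothetical property holder; the real SDO C++ classes look different.
struct TextProperty {
    std::string value;  // always the real, unescaped text

    void setCString(const char* text) { value = text; }       // stored as passed in
    const char* getCString() const { return value.c_str(); }  // returned as-is
};

// Same shape as libxml2's charactersSAXFunc: called for ordinary text,
// with escaped chars already converted by the parser.
void onCharacters(void* ctx, const unsigned char* ch, int len) {
    static_cast<TextProperty*>(ctx)->value.append(
        reinterpret_cast<const char*>(ch),
        static_cast<std::string::size_type>(len));
}

// Same shape as libxml2's cdataBlockSAXFunc: called with the content of a
// CDATA section, without the markers. It appends exactly like onCharacters,
// so the stored value never contains CDATA markup.
void onCdataBlock(void* ctx, const unsigned char* value, int len) {
    onCharacters(ctx, value, len);
}
```

With this model, saving would hand the stored value to xmlTextWriterWriteString() unchanged and rely on it for escaping, so neither CDATA markers nor entities ever reach the user.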

      Another advantage is that users no longer need to worry about
      (un)escaping special chars or CDATA as they do today. Disadvantage:
      the API behavior changes.

          People

            Assignee: Unassigned
            Reporter: Thomas Gentsch (tge)