Details
Bug
Status: Open
Major
Resolution: Unresolved
Cpp-M3
None
None
Description
We are using both C++ and Java SDO in a project and discovered some
misbehavior in the C++ components when XML data is converted from/to SDO
and the XML contains either escaped chars or CDATA. Java seems to do it
mostly right (see below).
When looking at the SDO (C++ M3) code and searching the web (e.g.
[1]), it looks as if this topic is a bit, well, incomplete in the C++
world.
The problem (C++):
- loading an XML with CDATA inside works nicely, the CDATA remains
  intact, therefore saving works nicely too. However, if I do a
  DataObjectPtr->getCString(), I get the CDATA in the returned value,
  which means that as a user I have to deal with that :-|
- loading an XML with escaped chars (e.g. &lt;) works too, libxml2 converts
  these chars. getCString() returns the real text (e.g. "<"), but
  saving does not re-insert the escaping, i.e. the resulting XML is
  not usable anymore (TUSCANY-1553)
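To illustrate the first point, here is a minimal sketch of the kind of workaround a caller currently has to write. It assumes that the value returned by getCString() still contains the literal CDATA markers; stripCDataMarkers is a hypothetical helper for illustration only, not part of the SDO API.

```cpp
#include <string>

// Hypothetical helper, NOT part of the SDO C++ API: removes the CDATA
// section markers from a value and keeps the text inside them.
std::string stripCDataMarkers(const std::string& value) {
    static const std::string open = "<![CDATA[";
    static const std::string close = "]]>";
    std::string result;
    std::string::size_type pos = 0;
    while (true) {
        std::string::size_type start = value.find(open, pos);
        if (start == std::string::npos) {
            // No further section: copy the rest unchanged.
            result.append(value, pos, std::string::npos);
            break;
        }
        // Copy the text in front of the section.
        result.append(value, pos, start - pos);
        std::string::size_type contentStart = start + open.size();
        std::string::size_type end = value.find(close, contentStart);
        if (end == std::string::npos) {
            // Unterminated section: keep the remainder as it is.
            result.append(value, start, std::string::npos);
            break;
        }
        // Copy only the section content, without the markers.
        result.append(value, contentStart, end - contentStart);
        pos = end + close.size();
    }
    return result;
}
```

A caller would wrap every read, e.g. stripCDataMarkers(obj->getCString()), which is exactly the burden described above.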
In Java this looks much better and quite as I'd expect it to:
- loading XML with either construct works
- using getCString() just returns the real text with the escaped
  sections converted
- saving works too; CDATA sections are lost but are converted back to
  escaped XML. This is not the original XML anymore, but at least it
  is valid and logically it is the same as the input
- Example:
  Input XML:
  <tns1:name>ü&lt;&gt;bla blub <![CDATA[ <<>> ]]></tns1:name>
  getCString() in Java:
  "ü<>bla blub <<>> "
  Saving this as XML:
  <tns1:name>ü&lt;&gt;bla blub &lt;&lt;&gt;&gt; </tns1:name>
  The only questionable thing is the saved "ü": should it be
  converted back to an escaped form, and if so, to which one?
Anyway, now the question: as it seems there were discussions going on
when SDO C++ was implemented, has the approach above (as in Java)
ever been considered and, if so, why has it not been followed?
I believe that this would also have been much simpler than it is today:
- while parsing
  - the cdata handler function of the SAX2 handler just
    appends the text returned by libxml2
  - escaped chars are converted by libxml2 anyway
- the property value now contains the real text
  (e.g. "ü<>bla blub <<>> ") and returns it just as-is in getCString()
- setting that property also just sets the passed-in value
- saving the property just calls libxml2 xmlTextWriterWriteString()
  which should escape the special chars
Another advantage is that users don't need to worry about (un)escaping
special chars or CDATA as today. Disadvantage: API behavior changes.
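
The proposed flow can be sketched roughly as follows. This is a library-free illustration under stated assumptions, not the SDO C++ implementation: TextValue, appendCharacters, appendCData, toXmlText and escapeXmlText are hypothetical names, and escapeXmlText only stands in for the escaping that libxml2's xmlTextWriterWriteString() would be expected to perform on save.

```cpp
#include <string>

// Hypothetical stand-in for the escaping done by the XML writer on save.
// Only the characters that must be escaped in text content are handled;
// all other bytes (e.g. a UTF-8 encoded "ü") are passed through unchanged.
std::string escapeXmlText(const std::string& text) {
    std::string out;
    out.reserve(text.size());
    for (char c : text) {
        switch (c) {
            case '&': out += "&amp;"; break;
            case '<': out += "&lt;";  break;
            case '>': out += "&gt;";  break;
            default:  out += c;       break;
        }
    }
    return out;
}

// Hypothetical property value; NOT the real SDO C++ class.
class TextValue {
public:
    // Ordinary character data: assumed to arrive from the parser with
    // entities such as &lt; already resolved.
    void appendCharacters(const std::string& text) { value_ += text; }

    // CDATA content: assumed to arrive without the section markers,
    // so it is appended like any other text.
    void appendCData(const std::string& text) { value_ += text; }

    // Reading returns the real text as-is.
    const char* getCString() const { return value_.c_str(); }

    // Setting stores the passed-in value as-is.
    void setCString(const std::string& text) { value_ = text; }

    // Saving escapes the special characters again.
    std::string toXmlText() const { return escapeXmlText(value_); }

private:
    std::string value_;
};
```

With this shape, neither the reader nor the writer of a property ever sees markup: the value holds plain text, and escaping happens in exactly one place, on save.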