Tuscany / TUSCANY-1088

SDO should tolerate malformed XML


Details

    Description

      I had some off-line discussion with Frank and Yang. Here is the summary:

      As an improvement to consumability, SDO should tolerate some malformed XML. In practice, incoming XML documents are often well-formed but not fully schema-valid. Rather than failing on deserialization in that case, we should consider making some assumptions and continuing. Some competing technologies do this today.

      Here's an example. Say we have this schema:

      <?xml version="1.0" encoding="UTF-8"?>
      <xsd:schema targetNamespace="http://QuickTest/HelloWorld"
      xmlns:tns="http://QuickTest/HelloWorld"
      xmlns:xsd="http://www.w3.org/2001/XMLSchema"
      elementFormDefault="qualified">
      <xsd:element name="sayHello">
      <xsd:complexType>
      <xsd:sequence>
      <xsd:element name="input1" nillable="true"
      type="xsd:string" />
      </xsd:sequence>
      </xsd:complexType>
      </xsd:element>
      </xsd:schema>

      If we receive an XML document that looks like this:

      <?xml version="1.0" encoding="UTF-8"?>
      <tns:sayHello xmlns:tns="http://QuickTest/HelloWorld"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://QuickTest/HelloWorld HelloWorldMessage.xsd ">
      <input1>input1</input1>
      </tns:sayHello>

      then validation fails because input1 is not namespace-qualified. Here is the XML that would work:

      <?xml version="1.0" encoding="UTF-8"?>
      <tns:sayHello xmlns:tns="http://QuickTest/HelloWorld"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://QuickTest/HelloWorld HelloWorldMessage.xsd ">
      <tns:input1>input1</tns:input1>
      </tns:sayHello>
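
      The difference between the two documents can be reproduced with the standard JAXP validation API. The following is a minimal sketch, not Tuscany SDO code: the class name QualifiedFormCheck and its constants are made up for illustration, the schema is inlined, and the xsi:schemaLocation hint is omitted so the example has no file dependencies.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;

// Illustrative class, not part of Tuscany SDO.
class QualifiedFormCheck {
    // Same schema as in the description, inlined as a string.
    static final String XSD =
        "<xsd:schema targetNamespace='http://QuickTest/HelloWorld'"
      + " xmlns:tns='http://QuickTest/HelloWorld'"
      + " xmlns:xsd='http://www.w3.org/2001/XMLSchema'"
      + " elementFormDefault='qualified'>"
      + "<xsd:element name='sayHello'><xsd:complexType><xsd:sequence>"
      + "<xsd:element name='input1' nillable='true' type='xsd:string'/>"
      + "</xsd:sequence></xsd:complexType></xsd:element></xsd:schema>";

    // Child element has no namespace: rejected under elementFormDefault="qualified".
    static final String UNQUALIFIED =
        "<tns:sayHello xmlns:tns='http://QuickTest/HelloWorld'>"
      + "<input1>input1</input1></tns:sayHello>";

    // Child element is in the target namespace: accepted.
    static final String QUALIFIED =
        "<tns:sayHello xmlns:tns='http://QuickTest/HelloWorld'>"
      + "<tns:input1>input1</tns:input1></tns:sayHello>";

    /** Returns true if the instance validates against XSD, false otherwise. */
    static boolean isValid(String instance) {
        try {
            Schema schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
            // With no ErrorHandler set, validation errors are thrown as SAXException.
            schema.newValidator().validate(new StreamSource(new StringReader(instance)));
            return true;
        } catch (org.xml.sax.SAXException e) {
            return false; // schema violation, or a schema/instance that cannot be parsed
        } catch (java.io.IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("unqualified valid? " + isValid(UNQUALIFIED));
        System.out.println("qualified valid?   " + isValid(QUALIFIED));
    }
}
```

      Only the document with the qualified tns:input1 child should pass; the unqualified one is rejected by the validator.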

      Frank mentioned 2 potential approaches:

      1. Read the element in as if it were an open content property. If you reserialize, the output would be the same (still invalid).
      2. If a property with the same name (but a different namespace) exists, associate the element with that property. When you reserialize, the output will then be correct.

      The latter seems the better approach.
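
      To make the two approaches concrete, here is a rough sketch of the lookup they imply. It is an assumption about how such a fallback could work, not the code in the attached patch: LenientPropertyResolver and resolve are hypothetical names, and declared properties are modelled simply as QNames rather than SDO Property objects. The sketch also assumes the repair should only happen when exactly one declared property shares the local name.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.namespace.QName;

// Hypothetical helper, not an SDO API.
class LenientPropertyResolver {
    /**
     * Picks the declared property for an incoming element.
     * Falls back to a local-name match when the namespace differs (approach 2).
     * Returns null when nothing matches unambiguously, in which case the
     * caller would treat the element as open content (approach 1).
     */
    static QName resolve(List<QName> declared, QName incoming) {
        if (declared.contains(incoming)) {
            return incoming; // exact namespace and local name match
        }
        List<QName> sameLocalName = new ArrayList<>();
        for (QName candidate : declared) {
            if (candidate.getLocalPart().equals(incoming.getLocalPart())) {
                sameLocalName.add(candidate);
            }
        }
        // Only repair the namespace when the match is unambiguous.
        return sameLocalName.size() == 1 ? sameLocalName.get(0) : null;
    }

    public static void main(String[] args) {
        List<QName> declared =
            Collections.singletonList(new QName("http://QuickTest/HelloWorld", "input1"));
        // Unqualified input1 resolves to the declared, qualified property.
        System.out.println(resolve(declared, new QName("input1")));
        // An unknown name does not resolve and would become open content.
        System.out.println(resolve(declared, new QName("other")));
    }
}
```

      Because the resolved QName carries the schema's namespace, reserializing from it would emit the element correctly qualified, which is the property that makes approach 2 attractive.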

      Yang also contributed the following:

      It's friendly to tolerate a user forgetting to qualify a local element. However, there are three scenarios that may not all warrant the same elementFormDefault="qualified" enforcement policy. What do you think?

      3-1. <tns:sayHello xmlns:tns="http://QuickTest/HelloWorld">
      <input1>input1</input1>
      </tns:sayHello>

      The author may have forgotten to qualify the "input1" element, although "input1" could also be a global element with no namespace.
      It's friendly to tolerate this.

      3-2. <tns:sayHello xmlns:tns="http://QuickTest/HelloWorld" xmlns:onPurpose="differentNameSpace">
      <onPurpose:input1>input1</onPurpose:input1>
      </tns:sayHello>
      The author has explicitly qualified the "input1" element; I'm not confident we should tolerate this.

      3-3. <tns:sayHello xmlns:tns="http://QuickTest/HelloWorld" xmlns="differentNameSpace"> <!-- xmlns= places all unprefixed elements (but not attributes) in "differentNameSpace" -->
      <input1>input1</input1>
      </tns:sayHello>
      It's hard to tell whether the author forgot to qualify the "input1" element or not.
      I would bet not. Should we tolerate this?
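
      The three scenarios can be told apart at parse time from the child element's namespace URI and prefix. The sketch below only classifies them; it does not decide which ones to tolerate, since that is the open question here. QualificationScenarios, Kind and the method names are hypothetical, and "urn:example:different" stands in for "differentNameSpace".

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Illustrative classifier, not part of Tuscany SDO.
class QualificationScenarios {
    enum Kind { QUALIFIED, UNQUALIFIED, EXPLICIT_OTHER_NAMESPACE, DEFAULT_OTHER_NAMESPACE }

    static Kind classify(Element child, String expectedNamespace) {
        String ns = child.getNamespaceURI();
        if (expectedNamespace.equals(ns)) return Kind.QUALIFIED;
        if (ns == null) return Kind.UNQUALIFIED;          // scenario 3-1
        return child.getPrefix() != null
            ? Kind.EXPLICIT_OTHER_NAMESPACE               // scenario 3-2
            : Kind.DEFAULT_OTHER_NAMESPACE;               // scenario 3-3
    }

    /** Parses xml (namespace-aware) and classifies the root's first child element. */
    static Kind classifyFirstChild(String xml, String expectedNamespace) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Element root = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
            for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n instanceof Element) return classify((Element) n, expectedNamespace);
            }
            throw new IllegalArgumentException("root has no child element");
        } catch (javax.xml.parsers.ParserConfigurationException
                 | org.xml.sax.SAXException | java.io.IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String tns = "http://QuickTest/HelloWorld";
        String open = "<tns:sayHello xmlns:tns='" + tns + "'";
        System.out.println(classifyFirstChild(
            open + "><input1>input1</input1></tns:sayHello>", tns));
        System.out.println(classifyFirstChild(
            open + " xmlns:onPurpose='urn:example:different'>"
                 + "<onPurpose:input1>input1</onPurpose:input1></tns:sayHello>", tns));
        System.out.println(classifyFirstChild(
            open + " xmlns='urn:example:different'><input1>input1</input1></tns:sayHello>", tns));
    }
}
```

      A tolerance policy could then be expressed as a decision per Kind, for example repairing only UNQUALIFIED and leaving the other two as open content.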

      Attachments

        1. patch
          10 kB
          Yang ZHONG
        2. patch
          13 kB
          Yang ZHONG
        3. FormTestCase.java
          3 kB
          Yang ZHONG

          People

            kgoodson Kelvin Goodson
            kwilliams Kevin Joe Williams