Details
Bug
Status: Closed
Minor
Resolution: Fixed
2.5.0, 2.6.0
None
Windows 2000
Description
Validating an XML instance against a Schema with an unbounded xsd:list type can take far more than O(n) processing resources, where n is the number of items in the list.
To reproduce, use this Schema:
pq.xsd
<?xml version="1.0" encoding="utf-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:pqns="http://swsis.cambridge.arm.com/~dearlam/xercestest/" targetNamespace="http://swsis.cambridge.arm.com/~dearlam/xercestest/"
elementFormDefault="qualified" version="0.1">
<xs:annotation>
<xs:documentation xml:lang="en">
XML schema for Hofstadter's Gödel pq-System.
Test data for list data type validation.
</xs:documentation>
</xs:annotation>
<xs:element name="pqData" type="pqns:pqDataType"></xs:element>
<xs:complexType name="pqDataType">
<xs:complexContent>
<xs:restriction base="xs:anyType">
<xs:sequence minOccurs="1" maxOccurs="1">
<xs:element name="dashes" type="pqns:dashBlockType"></xs:element>
<xs:element name="p" type="xs:string" xsi:nill="true"></xs:element>
<xs:element name="dashes" type="pqns:dashBlockType"></xs:element>
<xs:element name="q" type="xs:string" xsi:nill="true"></xs:element>
<xs:element name="dashes" type="pqns:dashBlockType"></xs:element>
</xs:sequence>
</xs:restriction>
</xs:complexContent>
</xs:complexType>
<xs:complexType name="porqType">
<xs:simpleContent>
<xs:extension base="xs:string"></xs:extension>
</xs:simpleContent>
</xs:complexType>
<xs:complexType name="dashBlockType">
<xs:simpleContent>
<xs:extension base="pqns:dataDashes"></xs:extension>
</xs:simpleContent>
</xs:complexType>
<xs:simpleType name="Dash">
<xs:restriction base="xs:string">
<xs:pattern value="[\-]"></xs:pattern>
</xs:restriction>
</xs:simpleType>
<xs:simpleType name="dataDashes">
<xs:restriction base="pqns:DashList">
<xs:minLength value="0" />
</xs:restriction>
</xs:simpleType>
<xs:simpleType name="DashList">
<xs:list itemType="pqns:Dash"></xs:list>
</xs:simpleType>
</xs:schema>
and this XML file
pqData0.xml
<?xml version="1.0" encoding="utf-8" ?>
<pqData xmlns='http://swsis.cambridge.arm.com/~dearlam/xercestest/'
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://swsis.cambridge.arm.com/~dearlam/xercestest/
http://swsis.cambridge.arm.com/~dearlam/xercestest/pq.xsd">
<dashes>
- -
</dashes>
<p/>
<dashes>-</dashes>
<q/>
<dashes>-</dashes>
</pqData>
(replacing swsis.cambridge.arm.com/~dearlam/xercestest with your location)
Then use
domprint -wfpp=on pqData0.xml
and
domprint -n -s -wfpp=on pqData0.xml
to print the XML without and with validation, respectively.
Both print in an equally short time. OK.
Now save a copy of pqData0.xml as pqData1.xml and, in the copy, replace
- -
with 4000 lines of - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
This gives a 500 KB file (which mimics my real data).
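For anyone who would rather not paste 4000 lines by hand, the larger test file can be generated with a short script. This is only a sketch: the function names are made up for illustration, and the figure of 62 dashes per line is an assumption chosen so that 4000 lines come to roughly 500 KB; adjust it to match your own data.

```python
import re


def make_dash_block(lines=4000, dashes_per_line=62):
    """Build the replacement text: `lines` lines of space-separated dashes.

    dashes_per_line=62 is an assumed value that yields roughly 500 KB.
    """
    line = " ".join("-" * dashes_per_line)  # "- - - ... -"
    return "\n".join([line] * lines)


def make_pq_data1(src_text, lines=4000, dashes_per_line=62):
    """Return src_text with the first '- -' dashes content replaced by the big block."""
    block = make_dash_block(lines, dashes_per_line)
    pattern = re.compile(r"(<dashes>\s*)- -(\s*</dashes>)")
    # A function is used as the replacement so the block is inserted literally.
    return pattern.sub(lambda m: m.group(1) + block + m.group(2), src_text, count=1)


def write_pq_data1(src_path="pqData0.xml", dst_path="pqData1.xml"):
    """Read pqData0.xml and write the enlarged pqData1.xml next to it."""
    with open(src_path, encoding="utf-8") as src:
        text = src.read()
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(make_pq_data1(text))
```

Calling `write_pq_data1()` in the directory that holds pqData0.xml should produce a pqData1.xml of a little under 500 KB.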
If you then try
domprint -wfpp=on pqData1.xml
and
domprint -n -s -wfpp=on pqData1.xml
the first prints instantly (pipe it to NUL if you like), but the second consumes 99% CPU for 230 seconds, then prints.
That's only about 2 KB per second!
–
(My suspicion is that XMLString::tokenizeString is calling subString(), and with it recalculating the string length, way too many times...)
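To show why that kind of pattern would hurt, here is a small cost model. It is not the Xerces-C code and I have not confirmed that the library behaves this way; the function names are hypothetical, and a counter stands in for a C-style length scan that walks the whole string. Rescanning the length once per token costs about tokens × length character visits, while scanning once costs just the length.

```python
def c_strlen(s, counter):
    """Simulate a C-style length scan: charge one visit per character."""
    counter[0] += len(s)
    return len(s)


def tokenize_rescanning(s, counter):
    """Whitespace tokenizer that recomputes the string length for every token."""
    tokens = []
    i = 0
    while True:
        n = c_strlen(s, counter)  # full rescan on every pass (the suspected problem)
        while i < n and s[i].isspace():
            i += 1
        if i >= n:
            break
        start = i
        while i < n and not s[i].isspace():
            i += 1
        tokens.append(s[start:i])
    return tokens


def tokenize_once(s, counter):
    """Same tokenizer, but the length is computed a single time up front."""
    tokens = []
    n = c_strlen(s, counter)
    i = 0
    while i < n:
        while i < n and s[i].isspace():
            i += 1
        if i >= n:
            break
        start = i
        while i < n and not s[i].isspace():
            i += 1
        tokens.append(s[start:i])
    return tokens


if __name__ == "__main__":
    data = " ".join("-" * 1000)  # 1000 list items, 1999 characters
    slow, fast = [0], [0]
    assert tokenize_rescanning(data, slow) == tokenize_once(data, fast)
    print("character visits, rescanning:", slow[0])
    print("character visits, scan once: ", fast[0])
```

Both functions return the same tokens; only the number of simulated character visits differs, and the gap widens as the list grows.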
kind regards,
David