Details
Type: Bug
Status: Resolved
Priority: Minor
Resolution: Fixed
Affects Version/s: 2.6.2
None
Test Environment: Win XP SP1, JDK v1.5.0_02, Xerces v2.6.2 (manually used; overrides any other, if packaged with the JDK)
Description
Xerces rejects the following schema:
<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
<xs:element name="test">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="$?[0-9]+\.[0-9]{2}" />
</xs:restriction>
</xs:simpleType>
</xs:element>
</xs:schema>
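To check how a particular parser installation treats this schema, a small harness along the following lines may help. It is only a sketch: it assumes the standard JAXP `javax.xml.validation` API (available since JDK 1.5) and tests whichever schema implementation that API resolves to, which may or may not be the manually installed Xerces. The class name `PatternSchemaCheck` and its helper methods are made up for illustration, and the outcome depends on the parser version in use.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class PatternSchemaCheck {

    // Wraps a pattern value in the schema from this report.
    // Assumes the pattern contains no '"', '<' or '&' characters.
    public static String schemaFor(String pattern) {
        return "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
            + "<xs:element name=\"test\"><xs:simpleType>"
            + "<xs:restriction base=\"xs:string\">"
            + "<xs:pattern value=\"" + pattern + "\"/>"
            + "</xs:restriction></xs:simpleType></xs:element>"
            + "</xs:schema>";
    }

    // Returns null if the schema compiles, otherwise the parser's error message.
    public static String compileError(String pattern) {
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        try {
            factory.newSchema(
                new StreamSource(new StringReader(schemaFor(pattern))));
            return null;
        } catch (SAXException e) {
            return String.valueOf(e.getMessage());
        }
    }

    public static void main(String[] args) {
        // Default to the pattern from the report; other patterns can be
        // passed on the command line for comparison.
        String[] patterns =
            args.length > 0 ? args : new String[] {"$?[0-9]+\\.[0-9]{2}"};
        for (String p : patterns) {
            String error = compileError(p);
            System.out.println(
                p + " -> " + (error == null ? "accepted" : "rejected: " + error));
        }
    }
}
```

Running it with no arguments tries the pattern from the schema above and prints whether the configured parser accepts or rejects it.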
The code in org.apache.xerces.impl.xpath.regex.RegexParser throws a parse exception on the "$?" sequence unless the "$" character is escaped. For example, this works:
<xs:pattern value="\$?[0-9]+\.[0-9]{2}" />
The fundamental problem is that the Xerces RegexParser code does NOT follow the XML Schema specification, as defined by this URL:
http://www.w3.org/TR/2000/WD-xmlschema-2-20000407/#dt-metac
Specifically, the XML Schema specification does NOT give special meaning to the "$" and "^" characters, whereas the RegexParser code appears to treat them as the standard UNIX "end-of-line" and "start-of-line" anchors, respectively.
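The intended semantics can be illustrated with a short sketch. It uses java.util.regex as a stand-in, not the Xerces engine, and the helper `escapeAnchors` is hypothetical: it backslash-escapes "$" and "^" wherever java.util.regex would otherwise read them as anchors, so that they match literally as the pattern author expects. It does not handle other XML Schema regex features (for example character class subtraction). `Pattern.matches` requires the whole input to match, which is also how a schema pattern facet is applied to a value.

```java
import java.util.regex.Pattern;

public class XsdDollarSketch {

    // Hypothetical helper: escapes '$' and '^' outside character classes
    // when they are not already preceded by a backslash.
    public static String escapeAnchors(String xsdPattern) {
        StringBuilder out = new StringBuilder();
        int classDepth = 0;       // greater than 0 while inside [...]
        boolean escaped = false;  // true right after an unconsumed backslash
        for (int i = 0; i < xsdPattern.length(); i++) {
            char c = xsdPattern.charAt(i);
            if (escaped) {
                // Character following a backslash is copied unchanged.
                out.append(c);
                escaped = false;
                continue;
            }
            if (c == '\\') {
                escaped = true;
            } else if (c == '[') {
                classDepth++;
            } else if (c == ']' && classDepth > 0) {
                classDepth--;
            } else if (classDepth == 0 && (c == '$' || c == '^')) {
                out.append('\\');  // force a literal match
            }
            out.append(c);
        }
        return out.toString();
    }

    // Whole-string match of a value against the translated pattern.
    public static boolean matchesLiteralDollar(String xsdPattern, String value) {
        return Pattern.matches(escapeAnchors(xsdPattern), value);
    }

    public static void main(String[] args) {
        String pattern = "$?[0-9]+\\.[0-9]{2}";  // pattern from the report
        System.out.println(escapeAnchors(pattern));
        for (String value : new String[] {"$12.34", "12.34", "12.3"}) {
            System.out.println(
                value + " -> " + matchesLiteralDollar(pattern, value));
        }
    }
}
```

Under this reading, "$12.34" and "12.34" match the pattern, while "12.3" does not because two digits are required after the decimal point.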
Regards,
–
Darien Kindlund
The MITRE Corporation
InfoSec Engr / Scientist, Sr.
kindlund@mitre.org