  1. Xerces2-J
  2. XERCESJ-1061

Regex "$" and "^" characters treated as special chars in conflict with XML Schema spec


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.6.2
    • Fix Version/s: 2.9.0
    • None
    • Test Environment: Win XP SP1, JDK v1.5.0_02, Xerces v2.6.2 (manually used; overrides any other, if packaged with the JDK)

    Description

      Xerces rejects the following schema:
      <xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
      <xs:element name="test">
      <xs:simpleType>
      <xs:restriction base="xs:string">
      <xs:pattern value="$?[0-9]+\.[0-9]{2}" />
      </xs:restriction>
      </xs:simpleType>
      </xs:element>
      </xs:schema>

      The code within org.apache.xerces.impl.xpath.regex.RegexParser throws a parser exception over the use of the "$?" characters, unless the "$" character is escaped. For example, this works:

      <xs:pattern value="\$?[0-9]+\.[0-9]{2}" />

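      A minimal reproduction sketch, assuming the standard JAXP validation API (javax.xml.validation) is available and that whichever schema processor is on the classpath answers the factory lookup. The class and method names are hypothetical and not part of Xerces. It wraps a pattern value in the schema above, reports whether the schema compiles, and checks a sample value against it:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical helper class for reproducing the report; not Xerces code.
class PatternRepro {

    // Builds the schema from the report around the given pattern value.
    // The value is inserted verbatim, so it must not contain '"', '<' or '&'.
    static Schema compile(String patternValue) throws SAXException {
        String xsd =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
          + "<xs:element name=\"test\"><xs:simpleType>"
          + "<xs:restriction base=\"xs:string\">"
          + "<xs:pattern value=\"" + patternValue + "\" />"
          + "</xs:restriction></xs:simpleType></xs:element></xs:schema>";
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        return factory.newSchema(new StreamSource(new StringReader(xsd)));
    }

    // True if the schema processor accepts the pattern at all.
    static boolean schemaCompiles(String patternValue) {
        try {
            compile(patternValue);
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    // True if <test>value</test> is valid against the pattern.
    // The value is inserted verbatim, so it must not contain '<' or '&'.
    static boolean accepts(String patternValue, String value) {
        try {
            Schema schema = compile(patternValue);
            String xml = "<test>" + value + "</test>";
            schema.newValidator()
                  .validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Escaped form: the workaround described in this report.
        System.out.println(schemaCompiles("\\$?[0-9]+\\.[0-9]{2}"));
        // Unescaped form: the result depends on the parser version in use.
        System.out.println(schemaCompiles("$?[0-9]+\\.[0-9]{2}"));
    }
}
```

      On an affected parser the second call would be expected to report false; on a parser that includes the fix it should report true.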
      The fundamental problem is that the Xerces RegexParser code does NOT follow the XML Schema specification, as defined by this URL:
      http://www.w3.org/TR/2000/WD-xmlschema-2-20000407/#dt-metac

      Specifically, the XML Schema specification gives no special meaning to the "$" and "^" characters, whereas the RegexParser code appears to treat them as the conventional UNIX "end-of-line" and "start-of-line" anchors, respectively.
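
      To illustrate the difference in semantics, here is a small sketch using java.util.regex, which (like the UNIX convention) reads "$" and "^" as anchors. The helper names are hypothetical and this is not the change applied to RegexParser; it only shows how a schema pattern would need to be read so that "$" and "^" outside a character class are ordinary characters. It assumes the pattern uses no character class subtraction, and it relies on Pattern.matches to match the whole value, since schema patterns are implicitly anchored at both ends:

```java
import java.util.regex.Pattern;

// Hypothetical sketch; not the Xerces implementation.
class XsdAnchorSketch {

    // Escapes unescaped '$' and '^' that occur outside a character class,
    // so java.util.regex treats them as literals. A '^' inside a class is
    // left alone because it keeps its negation role there.
    static String escapeAnchors(String xsdPattern) {
        StringBuilder out = new StringBuilder();
        int classDepth = 0; // > 0 while inside [...]
        for (int i = 0; i < xsdPattern.length(); i++) {
            char c = xsdPattern.charAt(i);
            if (c == '\\' && i + 1 < xsdPattern.length()) {
                // Copy an existing escape sequence unchanged.
                out.append(c).append(xsdPattern.charAt(++i));
                continue;
            }
            if (c == '[') {
                classDepth++;
            } else if (c == ']' && classDepth > 0) {
                classDepth--;
            } else if (classDepth == 0 && (c == '$' || c == '^')) {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    // Whole-value match with '$' and '^' read as literals.
    static boolean xsdStyleMatches(String xsdPattern, String value) {
        return Pattern.matches(escapeAnchors(xsdPattern), value);
    }

    public static void main(String[] args) {
        String pattern = "$?[0-9]+\\.[0-9]{2}";
        System.out.println(escapeAnchors(pattern));
        System.out.println(xsdStyleMatches(pattern, "$12.34"));
        // Anchor reading of the same pattern, for comparison.
        System.out.println(Pattern.matches(pattern, "$12.34"));
    }
}
```

      For the pattern in this report, the helper produces the same escaped form that is given above as the workaround.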

      Regards,

      Darien Kindlund
      The MITRE Corporation
      InfoSec Engr / Scientist, Sr.
      kindlund@mitre.org

      Attachments

        1. regexparser.java
          43 kB
          Chris Carman
        2. RegexParser.diff
          1 kB
          Chris Carman

          People

            Assignee: Michael Glavassevich (mrglavas@ca.ibm.com)
            Reporter: Darien Kindlund (kindlund)
            Votes: 0
            Watchers: 3
