Xerces-C++ / XERCESC-1838

Unexpected validation result when comment is present

Details

    Description

      Schema:
      <?xml version="1.0" encoding="UTF-8"?>
      <xsd:schema targetNamespace="http://www.openuri.org/mySchema" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://www.openuri.org/mySchema" elementFormDefault="qualified" version="1.0">
        <xsd:element name="root">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="test" type="MyType" maxOccurs="unbounded"/>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
        <xsd:complexType name="MyType">
          <xsd:simpleContent>
            <xsd:extension base="xsd:byte">
            </xsd:extension>
          </xsd:simpleContent>
        </xsd:complexType>
      </xsd:schema>

      Document:
      <?xml version="1.0" encoding="UTF-8"?>
      <p1:root xmlns:p1="http://www.openuri.org/mySchema"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.openuri.org/mySchema schema.xsd">
      <p1:test><!-- text -->34</p1:test>
      <p1:test>
      <!-- text -->34</p1:test>
      <p1:test>
      34</p1:test>
      <p1:test>34</p1:test>
      <p1:test>
      34
      </p1:test>
      </p1:root>

      SAX2Print output:
      <?xml version="1.0" encoding="LATIN1"?>
      <p1:root xsi:schemaLocation="http://www.openuri.org/mySchema schema.xsd">
      <p1:test>34</p1:test>
      <p1:test> 34
      Error at file Z:\eTools\bin\Debug/instance-1.xml, line 7, char 26
      Message: Datatype error: Type:InvalidDatatypeValueException, Message:Value ' 34' does not match regular expression facet '[+\-]?[0-9]+'.
      </p1:test>
      <p1:test>34</p1:test>
      <p1:test>34</p1:test>
      <p1:test>34</p1:test>
      </p1:root>

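      Note the difference between the second and third 'test' elements in the document: the second has a comment between the leading line break and the value, and it is the only element that fails validation.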

      Suggested fix, in the last lines of the SchemaValidator::normalizeWhiteSpace function.

      Was:
      if (fCurReader->isWhitespace(*(srcPtr-1)))
          fTrailing = true;
      else
          fTrailing = false;

      Now:
      ...
      if (0 != (srcPtr) && fCurReader->isWhitespace(*(srcPtr-1)))
          fTrailing = true;
      else
          fTrailing = false;
      }

      Idea: isWhitespace returns 'true' for an empty string, but here an empty string does not mean whitespace.
      Please review.
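
To make the reported symptom easier to reason about, here is a minimal standalone sketch of whitespace collapsing over character data that arrives in chunks, as happens when a comment splits "\n" and "34" into two pieces. This is not the Xerces-C++ code: isXmlSpace, collapseChunks, matchesIntegerPattern and the guardEmpty switch are hypothetical names, and the guard shown (only remember trailing whitespace once something has been emitted) is just one possible reading of the idea above, not necessarily the patch that should be applied.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical stand-in for the reader's whitespace test (XML "S" characters).
static bool isXmlSpace(char c) {
    return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

// Collapses whitespace across a sequence of character-data chunks.
// guardEmpty == false models the reported behaviour; true applies the guard.
static std::string collapseChunks(const std::vector<std::string>& chunks,
                                  bool guardEmpty) {
    std::string out;
    bool trailing = false;  // did the previous chunk end in whitespace?

    for (const std::string& chunk : chunks) {
        if (chunk.empty()) continue;

        // Collapse inside the chunk: drop leading and trailing whitespace,
        // squeeze inner runs down to a single space.
        std::string piece;
        bool pendingSpace = false;
        for (char c : chunk) {
            if (isXmlSpace(c)) {
                pendingSpace = !piece.empty();
            } else {
                if (pendingSpace) piece += ' ';
                pendingSpace = false;
                piece += c;
            }
        }

        if (!piece.empty()) {
            // A separator is needed when whitespace sits between this piece
            // and earlier output.
            bool leadingGap = isXmlSpace(chunk.front()) && !out.empty();
            if (trailing || leadingGap) out += ' ';
            out += piece;
        }

        // Assumed guard: whitespace seen before any content is not "trailing".
        bool endsInSpace = isXmlSpace(chunk.back());
        trailing = guardEmpty ? (endsInSpace && !out.empty()) : endsInSpace;
    }
    return out;
}

// Same pattern as the facet quoted in the error message.
static bool matchesIntegerPattern(const std::string& value) {
    static const std::regex pattern("[+\\-]?[0-9]+");
    return std::regex_match(value, pattern);
}
```

In this model, collapseChunks({"\n", "34"}, false) yields " 34", which fails the pattern just like the second 'test' element, while the guarded call yields "34". A single chunk such as "\n34\n" collapses to "34" either way, which matches the elements without a comment.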

      Attachments

        1. test.xml
          0.4 kB
          Boris Kolpackov
        2. test.xsd
          0.8 kB
          Boris Kolpackov

          People

            bsk Boris Kolpackov
            alexeyme Alexey Miroshnichenko