Xerces2-J / XERCESJ-1163

javax.xml.validation.Validator#validate implementation does not support a DOMSource argument

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 2.8.0
    • Fix Version/s: None
    • Labels:
      None

      Description

      Validator#validate implementation does not support a DOMSource argument. The following SAXParseException is always thrown:

      Exception in thread "main" org.xml.sax.SAXParseException: cvc-elt.1: Cannot find the declaration of element 'xxx'.

      The problem is not seen in the 1.5 jdk.

      I've supplied a test class that successfully validates an XML instance document using a StreamSource and subsequently fails to validate a DOMSource representation of the same XML.

      import java.io.StringReader;
      import java.io.IOException;
      import javax.xml.XMLConstants;
      import javax.xml.parsers.DocumentBuilder;
      import javax.xml.parsers.DocumentBuilderFactory;
      import javax.xml.parsers.ParserConfigurationException;
      import javax.xml.transform.stream.StreamSource;
      import javax.xml.transform.dom.DOMSource;
      import javax.xml.validation.Schema;
      import javax.xml.validation.SchemaFactory;
      import javax.xml.validation.Validator;
      import org.xml.sax.SAXException;
      import org.xml.sax.InputSource;
      import org.w3c.dom.Document;

      public final class ValidatorBug {

          private static final String SCHEMA =
              "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" +
              "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\" elementFormDefault=\"qualified\" attributeFormDefault=\"unqualified\">\n" +
              " <xs:element name=\"root\"/>\n" +
              "</xs:schema>";

          private static final String XML = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><root/>";

          public static void main(String[] args) throws SAXException, IOException, ParserConfigurationException {
              SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
              Schema schema = schemaFactory.newSchema(new StreamSource(new StringReader(SCHEMA)));
              Validator validator = schema.newValidator();

              System.out.println("\nvalidating stream source");
              validator.validate(new StreamSource(new StringReader(XML))); // <--- WORKS
              System.out.println("valid");

              DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
              DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
              Document document = documentBuilder.parse(new InputSource(new StringReader(XML)));

              System.out.println("\nvalidating DOM source");
              validator.validate(new DOMSource(document)); // <--- PROBLEM
              System.out.println("valid");
          }
      }

      The exception:

      Exception in thread "main" org.xml.sax.SAXParseException: cvc-elt.1: Cannot find the declaration of element 'root'.
      at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
      at org.apache.xerces.util.ErrorHandlerWrapper.error(Unknown Source)
      at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
      at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
      at org.apache.xerces.impl.xs.XMLSchemaValidator.handleStartElement(Unknown Source)
      at org.apache.xerces.impl.xs.XMLSchemaValidator.startElement(Unknown Source)
      at org.apache.xerces.jaxp.validation.DOMValidatorHelper.beginNode(Unknown Source)
      at org.apache.xerces.jaxp.validation.DOMValidatorHelper.validate(Unknown Source)
      at org.apache.xerces.jaxp.validation.DOMValidatorHelper.validate(Unknown Source)
      at org.apache.xerces.jaxp.validation.ValidatorImpl.validate(Unknown Source)
      at javax.xml.validation.Validator.validate(Validator.java:82)
      at ValidatorBug.main(ValidatorBug.java:42)

      (This bug is represented by XERCESJ-1132 and XERCESJ-1161, but they were in the wrong component)

        Activity

        Steven Grossman created issue -
        Michael Glavassevich added a comment -

        There is a warning in the Javadoc for DOMSource [1] which says: "Note that XSLT requires namespace support. Attempting to transform a DOM that was not constructed with a namespace-aware parser may result in errors. Parsers can be made namespace aware by calling DocumentBuilderFactory.setNamespaceAware(boolean awareness)." The same applies to XML schema validation which also requires namespace support. Schema validation is only defined for XML documents which have an infoset [2][3]. This implies both well-formedness and namespace conformance, the second of which is not checked by a non-namespace-aware parser.

        Each of the element/attribute nodes in a DOM built from a non-namespace-aware parser will have a null [4] local name. Note that [local name] is a required [2] property for both element and attribute information items. Validation of an input which is missing required infoset properties is undefined. This only "works" with Java 5.0 because it is attempting to fix-up the input, quite likely trying to make sense of documents which are not conformant [5] to the namespaces specification. I don't think this is something Xerces should be doing. You wouldn't want a compiler to try fixing syntax errors in source code by guessing what you meant. What happens when it's wrong? In order for the validator to behave predictably, you must provide it with an input constructed by a namespace-aware parser (i.e. factory.setNamespaceAware(true)).

        [1] http://xerces.apache.org/xerces2-j/javadocs/api/javax/xml/transform/dom/DOMSource.html
        [2] http://www.w3.org/TR/xmlschema-1/#infoset
        [3] http://www.w3.org/TR/xmlschema-1/#concepts-data-model
        [4] http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/core.html#ID-2141741547
        [5] http://www.w3.org/TR/REC-xml-names/#Conformance
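        The point about namespace awareness can be illustrated with a minimal sketch. It is not taken from the original report: the class and method names are made up for illustration, and the schema is a trimmed copy of the one in the description. It prints the root element's local name for a DOM built with and without namespace support, then validates the namespace-aware DOM through a DOMSource.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Illustrative class name; not part of the original test case.
final class NamespaceAwareDomValidation {

    private static final String SCHEMA =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:element name=\"root\"/>"
        + "</xs:schema>";

    private static final String XML = "<root/>";

    /** Parses XML into a DOM, with or without namespace support. */
    static Document parse(boolean namespaceAware) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(namespaceAware);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(XML)));
    }

    /** Validates the DOM against SCHEMA; a SAXException signals a validation error. */
    static void validate(Document document) throws Exception {
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
            .newSchema(new StreamSource(new StringReader(SCHEMA)));
        schema.newValidator().validate(new DOMSource(document));
    }

    public static void main(String[] args) throws Exception {
        // Without namespace support the element has no [local name] property.
        System.out.println("non-namespace-aware local name: "
            + parse(false).getDocumentElement().getLocalName());

        // With namespace support the local name is populated.
        Document document = parse(true);
        System.out.println("namespace-aware local name: "
            + document.getDocumentElement().getLocalName());

        // Only the namespace-aware DOM is handed to the validator here.
        validate(document);
        System.out.println("valid");
    }
}
```

        The only change relative to the reporter's test class is the call to setNamespaceAware(true) before the DocumentBuilder is created.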

        Michael Glavassevich made changes -
        Field        Original Value   New Value
        Resolution                    Won't Fix [ 2 ]
        Status       Open [ 1 ]       Resolved [ 5 ]
        Ortwin Glück added a comment -

        Thanks Michael for this excellent explanation. I have a use case that is a bit special and I am not sure if this is a Xerces or a JDOM problem.

        I create a JDOM with all nodes in NO_NAMESPACE (the structure is defined by the customer). This JDOM is subsequently converted to a Xerces DOM with JDOM's DOMOutputter. Because I use NO_NAMESPACE, the DOMOutputter uses, for example, Document.createElement instead of Document.createElementNS to create the elements. This way I always get a DOM without namespace support. Now I want to validate this DOM against a schema (provided by the customer). The schema does not declare a target namespace, and the XML files reference it with noNamespaceSchemaLocation. This seems legal to me. But with no namespace support in the DOM I cannot use schema validation due to this issue.

        Would it be possible to use createElementNS instead here, even though I have no namespace? Then it can be solved in JDOM's DOMOutputter.
        If not, then you should think about how to validate a DOM with no namespace.

        Steven Grossman added a comment -

        Makes sense. I agree with your viewpoint of failing fast instead of attempting to fix up the input.

        I think users would benefit from an IllegalArgumentException being thrown here. The "Cannot find the declaration of element" exception fails to convey that the user provided a bad input.

        Thanks for your response.
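        Until the validator reports this condition itself, a caller could fail fast along the lines suggested above. The following is only a sketch of such a check on the application side: requireNamespaceAware is a hypothetical helper, not part of Xerces or JAXP, and it assumes that a null local name on an element or attribute is the sign of a DOM built without namespace support.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

// Illustrative class name; the helper below is hypothetical.
final class NamespaceAwareGuard {

    /**
     * Walks the tree rooted at the given node and throws
     * IllegalArgumentException if any element or attribute has a null local name.
     */
    static void requireNamespaceAware(Node node) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            requireLocalName(node);
            NamedNodeMap attributes = node.getAttributes();
            for (int i = 0; i < attributes.getLength(); i++) {
                requireLocalName(attributes.item(i));
            }
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            requireNamespaceAware(child);
        }
    }

    private static void requireLocalName(Node node) {
        if (node.getLocalName() == null) {
            throw new IllegalArgumentException(
                "Node '" + node.getNodeName() + "' has no local name; "
                + "build the DOM with a namespace-aware parser or the *NS factory methods");
        }
    }

    public static void main(String[] args) throws Exception {
        Document document =
            DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        // DOM Level 1 factory method: the resulting element has no local name.
        document.appendChild(document.createElement("root"));
        try {
            requireNamespaceAware(document);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

        Calling the helper just before Validator#validate would turn the misleading cvc-elt.1 error into an exception that names the real cause.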

        Michael Glavassevich added a comment -

        Ortwin,

        Yes. Unless you're writing a pure non-namespace-aware application [1] (which doesn't interact with namespace-aware processing models: XML schema, XSLT, XPath, XInclude, etc...), you should never use the DOM Level 1 methods: createElement/createAttribute. To create a namespace-aware element node with no namespace, you would call createElementNS(null, <<ELEMENT_NAME>>) [2].

        [1] http://www.w3.org/DOM/faq.html#create
        [2] http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/core.html#ID-DocCrElNS

        Steven,

        I agree. This error message is particularly misleading because it uses the tag name as the replacement parameter instead of the local name, which is null. I plan to change the schema validator so that it checks for null local names and reports an error for this malformed input.
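        For the createElementNS(null, ...) advice to Ortwin, here is a small sketch of a DOM built programmatically with no namespace and then validated against a schema without a target namespace. The class and method names are illustrative only, and the schema is a trimmed copy of the one from the issue description.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Illustrative class name; not part of Xerces, JDOM, or the original report.
final class NoNamespaceDomValidation {

    // Schema with no target namespace, as in the use case described above.
    private static final String SCHEMA =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:element name=\"root\"/>"
        + "</xs:schema>";

    /** Builds a one-element document using the DOM Level 2 factory method. */
    static Document buildDocument() throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document document = factory.newDocumentBuilder().newDocument();
        // A null namespace URI means "no namespace"; the local name is still set.
        Element root = document.createElementNS(null, "root");
        document.appendChild(root);
        return document;
    }

    /** Validates the DOM against SCHEMA; a SAXException signals a validation error. */
    static void validate(Document document) throws Exception {
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
            .newSchema(new StreamSource(new StringReader(SCHEMA)));
        schema.newValidator().validate(new DOMSource(document));
    }

    public static void main(String[] args) throws Exception {
        Document document = buildDocument();
        Element root = document.getDocumentElement();
        System.out.println("namespace URI: " + root.getNamespaceURI());
        System.out.println("local name: " + root.getLocalName());
        validate(document);
        System.out.println("valid");
    }
}
```

        A converter such as JDOM's DOMOutputter would need the same substitution, createElementNS(null, name) in place of createElement(name), for elements that are in no namespace.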

        Ortwin Glück added a comment -

        Thanks Michael. Passing the problem on to JDOM. People experiencing the problem can find a JDOM patch here: http://www.odi.ch/weblog/posting.php?posting=301

        Mark Thomas made changes -
        Workflow jira [ 12369017 ] Default workflow, editable Closed status [ 12575367 ]
        Mark Thomas made changes -
        Workflow Default workflow, editable Closed status [ 12575367 ] jira [ 12598025 ]
        Transition: Open to Resolved
        Time In Source Status: 7d 15h 29m
        Execution Times: 1
        Last Executer: Michael Glavassevich
        Last Execution Date: 23/May/06 13:26

          People

          • Assignee: Unassigned
          • Reporter: Steven Grossman
          • Votes: 1
          • Watchers: 1
