  1. Spatial Information Systems
  2. SIS-152

Consider using XSLT for handling XML documents compliant with different versions of the standards


    Details

    • Type: Task
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 0.3
    • Fix Version/s: None
    • Component/s: Utilities
    • Labels:
      None

      Description

      OGC standards are updated once every few years. When a standard changes significantly, OGC publishes a new XML schema with a new namespace URI. For example, in the upgrade from GML 3.1 to 3.2, the namespace URI changed from http://www.opengis.net/gml to http://www.opengis.net/gml/3.2.

      The most straightforward way to support different GML versions with JAXB is to have a separate set of classes for each GML version. This is made easy by running the xjc compiler on the XML schemas provided by OGC. But OGC/ISO standards have thousands of elements, and duplicating all of them for every version has several drawbacks:

      • Massive code duplication (hundreds of classes, many of them strictly identical except for the namespace).
      • Handling the duplicated classes requires either a series of "if (x instanceof Y)" statements in every corner of SIS (inconceivable), or editing the xjc output to give the classes a common parent class or interface.
      • The namespaces of all versions appear in the xmlns attributes of the root element (we cannot always create separate JAXB contexts), which is confusing and prevents use of the usual prefixes for all versions except one.

      An alternative is to support "natively" (through JAXB annotations) only one version of each standard, and to transform XML documents at (un)marshalling time when a document uses a different version of the standard. This is often done by defining an XSLT stylesheet to be executed by the javax.xml.transform package. The following is an example of an XSLT stylesheet that changes the namespace.

      <?xml version="1.0" encoding="UTF-8"?>
      <xsl:stylesheet xmlns:xsl   = "http://www.w3.org/1999/XSL/Transform"
                      xmlns:xalan = "http://xml.apache.org/xslt"
                      xmlns:gml   = "http://www.opengis.net/gml/3.2"
                      version     = "1.0">
        <xsl:output method="xml" indent="yes" xalan:indent-amount="2"/>
      
        <!-- Identity copy. -->
        <xsl:template match="node()|@*">
          <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
          </xsl:copy>
        </xsl:template>
      
        <!-- Change elements namespace. -->
        <xsl:template match="gml:*">
          <xsl:element name="gml2:{local-name()}" namespace="http://www.opengis.net/gml">
            <xsl:apply-templates select="node()|@*"/>
          </xsl:element>
        </xsl:template>
      </xsl:stylesheet>
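
      As an illustration, a stylesheet like the one above could be applied from Java with javax.xml.transform roughly as follows. This is only a minimal sketch, not SIS code: the class name GmlNamespaceXslt and its transform method are hypothetical, the stylesheet is embedded as a string for self-containment, and the Xalan-specific indentation settings are omitted so that the sketch does not depend on a particular XSLT processor.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical helper (not part of SIS) applying the namespace-changing stylesheet. */
final class GmlNamespaceXslt {
    /** Same templates as the stylesheet in this issue, without the output settings. */
    private static final String STYLESHEET =
          "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + "                xmlns:gml='http://www.opengis.net/gml/3.2' version='1.0'>"
        + "  <xsl:template match='node()|@*'>"
        + "    <xsl:copy><xsl:apply-templates select='node()|@*'/></xsl:copy>"
        + "  </xsl:template>"
        + "  <xsl:template match='gml:*'>"
        + "    <xsl:element name='gml2:{local-name()}' namespace='http://www.opengis.net/gml'>"
        + "      <xsl:apply-templates select='node()|@*'/>"
        + "    </xsl:element>"
        + "  </xsl:template>"
        + "</xsl:stylesheet>";

    private GmlNamespaceXslt() {
    }

    /** Returns the given XML document with GML 3.2 elements moved to the GML 3.1 namespace. */
    static String transform(String xml) {
        try {
            // Compile the stylesheet, then run it on the document held in memory.
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Unchecked wrapper chosen only to keep the sketch short.
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(
                "<gml:Point xmlns:gml='http://www.opengis.net/gml/3.2'>"
              + "<gml:pos>1 2</gml:pos></gml:Point>"));
    }
}
```

      Note that the stylesheet renames elements only. Namespaced attributes such as gml:id go through the identity template and keep their original namespace, so a complete solution would need an additional template for them.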
      

      However, javax.xml.transform is heavy and performs more transformations than desired. For example, applying the above XSLT causes additional xmlns attributes to appear in all elements under the root.

      Apache SIS 0.4, in its org.apache.sis.xml package, takes a lighter approach based on custom javax.xml.stream.XMLStreamReader and XMLStreamWriter implementations used as "micro-transformers". This works for simple changes and reduces undesirable transformations.
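
      To give an idea of what such a "micro-transformer" can look like, here is a minimal sketch built on javax.xml.stream.util.StreamReaderDelegate. It is not the actual SIS implementation: the class name NamespaceMappingReader, the rootNamespace helper and the mapping direction (legacy GML namespace replaced by the GML 3.2 one on reading) are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.util.StreamReaderDelegate;

/** Hypothetical reader wrapper that reports one namespace URI in place of another. */
final class NamespaceMappingReader extends StreamReaderDelegate {
    /** Namespace found in the document (assumed to be the legacy one). */
    private static final String LEGACY = "http://www.opengis.net/gml";

    /** Namespace reported to the caller (assumed to be the natively supported one). */
    private static final String CURRENT = "http://www.opengis.net/gml/3.2";

    NamespaceMappingReader(XMLStreamReader in) {
        super(in);
    }

    /** Replaces the legacy URI; any other value, including null, is returned unchanged. */
    private static String map(String uri) {
        return LEGACY.equals(uri) ? CURRENT : uri;
    }

    private static QName map(QName name) {
        return new QName(map(name.getNamespaceURI()), name.getLocalPart(), name.getPrefix());
    }

    // Only the namespace-related accessors are overridden; all other events pass through.
    // A complete implementation would also wrap getNamespaceContext() and
    // getAttributeValue(String, String), which are omitted here for brevity.
    @Override public String getNamespaceURI()               { return map(super.getNamespaceURI()); }
    @Override public String getNamespaceURI(int index)      { return map(super.getNamespaceURI(index)); }
    @Override public String getNamespaceURI(String prefix)  { return map(super.getNamespaceURI(prefix)); }
    @Override public String getAttributeNamespace(int i)    { return map(super.getAttributeNamespace(i)); }
    @Override public QName  getName()                       { return map(super.getName()); }
    @Override public QName  getAttributeName(int i)         { return map(super.getAttributeName(i)); }

    /** Demonstration helper: namespace of the root element as seen through the wrapper. */
    static String rootNamespace(String xml) {
        try {
            XMLStreamReader in = new NamespaceMappingReader(
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml)));
            try {
                while (in.hasNext()) {
                    if (in.next() == XMLStreamConstants.START_ELEMENT) {
                        return in.getNamespaceURI();
                    }
                }
                return null;
            } finally {
                in.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Cannot read the XML document", e);
        }
    }
}
```

      A wrapper of this kind could in principle be passed to Unmarshaller.unmarshal(XMLStreamReader), so that JAXB sees only the namespace it was annotated for. No intermediate document is built and no stylesheet is compiled, which is why the approach is lighter, but it is limited to changes that can be expressed event by event.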

      However, if future SIS versions need to handle more complicated changes, we may revisit this choice and use XSLT. The purpose of this JIRA task is to keep in mind that switching to XSLT may be something to consider. This is not something to do now: we are waiting for more experience before determining whether XSLT is appropriate, given its cost and the output it produces.

              People

              • Assignee:
                Unassigned
                Reporter:
                desruisseaux Martin Desruisseaux