  1. Xerces2-J
  2. XERCESJ-1537

Problem with unparsed entity location when indirectly referenced in the DTDs

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.9.1
    • Fix Version/s: None
    • Component/s: DTD
    • Labels: None
    • Environment: All

    Description

      We have an XML file with the content:

      <?xml version="1.0" encoding="UTF-8"?>
      <!DOCTYPE doc SYSTEM "../dtd/entityProblem.dtd">
      <doc>
      <fig image="test"/>
      </doc>

      The main DTD, "entityProblem.dtd" (located at the path referenced by the DOCTYPE declaration), has the following content:

      <!ELEMENT doc ( fig )>
      <!ELEMENT fig EMPTY>
      <!ATTLIST fig image ENTITY #REQUIRED>

      <!ENTITY % entityProblem SYSTEM "../source/entityProblem.ent">
      %entityProblem;

      The DTD module "entityProblem.ent", which the main DTD includes from a folder relative to its own location, has the content:

      <?xml version="1.0" encoding="UTF-8"?>
      <!NOTATION gif SYSTEM "gif">
      <!ENTITY % test '<!ENTITY test SYSTEM "images/crane.gif" NDATA gif>'>
      %test;

      We transform the XML with an XSLT processor that uses Xerces for parsing (such as Saxon), applying the following stylesheet:

      <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      version="1.0">
      <xsl:template match="/">
      <xsl:text>
      </xsl:text>
      <xsl:value-of select="unparsed-entity-uri(/doc/fig/@image)"/>
      <xsl:text>
      </xsl:text>
      </xsl:template>
      </xsl:stylesheet>

      The unparsed entity location is then resolved relative to the current working directory (new File(".")) instead of relative to the DTD file in which the entity was declared.
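
The difference between the candidate base locations can be illustrated with plain URI resolution. This is only a sketch: the absolute paths ("file:/project/..." and "file:/work/") and the class name UnparsedEntityBase are hypothetical, and only the relative layout of the files is taken from this report.

```java
import java.net.URI;

public class UnparsedEntityBase {

    // Resolve a relative system ID against the URI of the file used as base.
    static URI resolve(String baseUri, String systemId) {
        return URI.create(baseUri).resolve(systemId);
    }

    public static void main(String[] args) {
        // Hypothetical absolute locations, chosen only for illustration.
        String mainDtd = "file:/project/dtd/entityProblem.dtd";
        String workingDir = "file:/work/";

        // The main DTD pulls in the module through "../source/entityProblem.ent".
        URI ent = resolve(mainDtd, "../source/entityProblem.ent");

        // Base = file that declares the unparsed entity (the expected result).
        System.out.println("relative to .ent file:    " + resolve(ent.toString(), "images/crane.gif"));
        // Base = main DTD.
        System.out.println("relative to main DTD:     " + resolve(mainDtd, "images/crane.gif"));
        // Base = current working directory (the behaviour described above).
        System.out.println("relative to working dir:  " + resolve(workingDir, "images/crane.gif"));
    }
}
```

Only the first of the three results points at the folder that actually contains "images/crane.gif" in the layout described here.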

      A possible solution is to modify the "org.apache.xerces.impl.XMLEntityManager.startEntity(String, boolean)" method. In the InternalEntity case, instead of creating its XMLInputSource without a system ID, like:

      xmlInputSource = new XMLInputSource(null, null, null, reader, null);

      you could set a system ID to the input source like:

      xmlInputSource = new XMLInputSource(null, fCurrentEntity != null ? fCurrentEntity.getExpandedSystemId() : null, null, reader, null);

      We implemented this solution as a patch in Oxygen XML Editor, but problems remain: with the patch in place, the image system ID is expanded relative to "entityProblem.dtd" and not relative to "entityProblem.ent".
      With MSXML.NET, the image location is correctly resolved relative to "entityProblem.ent".
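
To check which system ID the parser reports without involving an XSLT processor, a small SAX harness along the following lines might help. It is a sketch under assumptions: the class name UnparsedEntityRepro, the temporary directory and the "xml" folder holding the document are hypothetical; it uses whichever JAXP parser is on the classpath (so the Xerces build under test has to be placed there); and it only prints the reported value rather than claiming what that value will be.

```java
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

public class UnparsedEntityRepro {

    // Write one of the sample files, creating its parent folder if needed.
    static void write(Path file, String content) throws Exception {
        Files.createDirectories(file.getParent());
        Files.write(file, content.getBytes(StandardCharsets.UTF_8));
    }

    // Parse the document and collect name -> system ID for every unparsed entity declaration.
    static Map<String, String> unparsedEntities(File xml) throws Exception {
        Map<String, String> found = new LinkedHashMap<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void unparsedEntityDecl(String name, String publicId,
                                           String systemId, String notationName) {
                found.put(name, systemId);
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(xml, handler);
        return found;
    }

    // Recreate the three files from this report in a temporary folder and parse them.
    static Map<String, String> run() {
        try {
            Path root = Files.createTempDirectory("xercesj-1537");
            write(root.resolve("xml/doc.xml"),
                "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<!DOCTYPE doc SYSTEM \"../dtd/entityProblem.dtd\">\n"
                + "<doc>\n<fig image=\"test\"/>\n</doc>\n");
            write(root.resolve("dtd/entityProblem.dtd"),
                "<!ELEMENT doc ( fig )>\n"
                + "<!ELEMENT fig EMPTY>\n"
                + "<!ATTLIST fig image ENTITY #REQUIRED>\n"
                + "<!ENTITY % entityProblem SYSTEM \"../source/entityProblem.ent\">\n"
                + "%entityProblem;\n");
            write(root.resolve("source/entityProblem.ent"),
                "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<!NOTATION gif SYSTEM \"gif\">\n"
                + "<!ENTITY % test '<!ENTITY test SYSTEM \"images/crane.gif\" NDATA gif>'>\n"
                + "%test;\n");
            return unparsedEntities(root.resolve("xml/doc.xml").toFile());
        } catch (Exception e) {
            throw new IllegalStateException("Reproduction failed", e);
        }
    }

    public static void main(String[] args) {
        // Print the base folder each reported system ID ends up under.
        run().forEach((name, systemId) -> System.out.println(name + " -> " + systemId));
    }
}
```

Comparing the printed system ID with the temporary "source" and "dtd" folders shows which base the parser used for the declaration.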

      Attachments

        1. uri-problem.zip
          3 kB
          Radu Coravu

          People

            Assignee: Michael Glavassevich (mrglavas@ca.ibm.com)
            Reporter: Radu Coravu (radu_coravu)
            Votes: 0
            Watchers: 1

              Time Tracking

                Estimated: 4h
                Remaining: 4h
                Logged: Not Specified