FOP-3217

Invalid XMP stream with Saxon on classpath

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.10
    • Fix Version/s: None
    • Component/s: renderer/pdf
    • Labels: None
    • Environment: FOP 2.10, Saxon 12.5 HE, OpenJDK 21, Fedora 41

    Description

      FOP generates an invalid XMP stream when Saxon is on the classpath, which results in invalid PDF/A-3 files. I originally ran into this problem in the context of Apache Camel, but eventually traced it back to FOP. Here is a reproducible test procedure.

      The xconf and the static FO file are attached. I use the FOP binary package from https://www.apache.org/dyn/closer.cgi?filename=/xmlgraphics/fop/binaries/fop-2.10-bin.tar.gz&action=download, the current Saxon 12.5 HE, and OpenJDK 21 on Fedora 41.

      Create PDF without Saxon:

      $ fop-2.10/fop/fop -c PDFA3Xmp.xconf -fo PDFXMP.fo -pdf output-ok.pdf
      

      Create PDF with Saxon:

      $ export CLASSPATH=SaxonHE12-5J/saxon-he-12.5.jar:SaxonHE12-5J/lib/xmlresolver-5.2.2.jar:SaxonHE12-5J/lib/xmlresolver-5.2.2-data.jar:SaxonHE12-5J/lib/jline-2.14.6.jar
      $ fop-2.10/fop/fop -c PDFA3Xmp.xconf -fo PDFXMP.fo -pdf output-saxon.pdf
      

      Here is the result without Saxon:

      $ pdfinfo -meta output-ok.pdf
      
      ...
             <rdf:Description xmlns:pdfaExtension="http://www.aiim.org/pdfa/ns/extension/" rdf:about="">
                 <pdfaExtension:schemas>
                     <rdf:Bag>
                         <rdf:li rdf:parseType="Resource">
                             <pdfaSchema:property xmlns:pdfaSchema="http://www.aiim.org/pdfa/ns/schema#">
                                 <rdf:Seq>
                                     <rdf:li rdf:parseType="Resource">
                                         <pdfaProperty:name xmlns:pdfaProperty="http://www.aiim.org/pdfa/ns/property#">split</pdfaProperty:name>
                                     </rdf:li>
                                 </rdf:Seq>
                             </pdfaSchema:property>
                         </rdf:li>
                     </rdf:Bag>
                 </pdfaExtension:schemas>
             </rdf:Description>
      ...
      

       

      Here is the result with Saxon:

      $ pdfinfo -meta output-saxon.pdf
      
      ...
           <rdf:RDF xmlns:pdfaExtension="http://www.aiim.org/pdfa/ns/extension/" rdf:about="">
              <pdfaExtension:schemas>
                 <rdf:Bag>
                    <rdf:li rdf:parseType="Resource">
                       <pdfaSchema:property>
                          <rdf:Seq>
                             <rdf:li rdf:parseType="Resource">
                                <pdfaProperty:name>split</pdfaProperty:name>
                             </rdf:li>
                          </rdf:Seq>
                       </pdfaSchema:property>
                    </rdf:li>
                 </rdf:Bag>
              </pdfaExtension:schemas>
           </rdf:RDF>
      ...
      

      Not only are the namespace declarations xmlns:pdfaSchema and xmlns:pdfaProperty missing, but the rdf:Description element is now also called rdf:RDF (?!)

       

      Expected behavior: the XMP stream should be equivalent regardless of whether Saxon or only the JVM's default XSLT transformer is available. In particular, the result must be well-formed XML in which every namespace prefix in use is declared.
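
      The namespace part of this expectation can be checked mechanically. Below is a minimal sketch, assuming the XMP packet has already been extracted as a string (for example from the pdfinfo -meta output). The class XmpNamespaceCheck and its methods are hypothetical helpers, not part of FOP. Because the excerpts above omit the enclosing element that declares the rdf prefix, the sketch wraps each fragment in an element that binds the standard RDF namespace.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration; not part of FOP.
final class XmpNamespaceCheck {

    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    /** Returns true if xml is well-formed and every namespace prefix it uses is declared. */
    static boolean isNamespaceWellFormed(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Namespace awareness is off by default; it is needed to detect unbound prefixes.
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed, or a prefix is not bound
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Wraps a fragment in an element that declares the rdf prefix (assumption: the excerpts omit it). */
    static String wrap(String fragment) {
        return "<rdf:RDF xmlns:rdf=\"" + RDF_NS + "\">" + fragment + "</rdf:RDF>";
    }

    public static void main(String[] args) {
        String withoutSaxon = wrap(
            "<pdfaProperty:name xmlns:pdfaProperty=\"http://www.aiim.org/pdfa/ns/property#\">"
            + "split</pdfaProperty:name>");
        String withSaxon = wrap("<pdfaProperty:name>split</pdfaProperty:name>");

        System.out.println("without Saxon: " + isNamespaceWellFormed(withoutSaxon));
        System.out.println("with Saxon:    " + isNamespaceWellFormed(withSaxon));
    }
}
```

      A check like this only covers well-formedness and prefix binding; it does not validate the packet against the PDF/A rules.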

      Possible workaround: the JAXP transformer factory can be reset to the JDK's built-in implementation with a Java command-line parameter:

      -Djavax.xml.transform.TransformerFactory=com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl
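
      To confirm which factory is actually picked up, with or without the parameter, a small probe can help. This is a minimal sketch; the class TransformerFactoryProbe is a hypothetical name. It relies on TransformerFactory.newInstance(), which follows the JAXP lookup (system property, then service providers on the classpath), and on TransformerFactory.newDefaultInstance() (Java 9 and later), which always returns the JDK's built-in implementation.

```java
import javax.xml.transform.TransformerFactory;

// Hypothetical diagnostic helper; not part of FOP.
final class TransformerFactoryProbe {

    // The JDK's built-in XSLT factory, as named in the workaround above.
    static final String JDK_FACTORY =
        "com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl";

    /** Class name of the factory selected by the normal JAXP lookup. */
    static String lookedUpFactory() {
        return TransformerFactory.newInstance().getClass().getName();
    }

    /** Class name of the JDK's built-in factory, ignoring the classpath and system property. */
    static String builtInFactory() {
        return TransformerFactory.newDefaultInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("JAXP lookup:  " + lookedUpFactory());
        System.out.println("JDK built-in: " + builtInFactory());
    }
}
```

      Run with the same CLASSPATH as the failing fop call, the first line should name a Saxon class if Saxon is being selected, and the JDK class once the parameter is set.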
      

      Caveat: when Saxon is on the classpath, there is probably a reason for it, and other parts of the application might expect Saxon to be the default. With this workaround, they would have to instantiate the Saxon transformer factory explicitly.
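
      Explicit instantiation could look like the following minimal sketch. It assumes Saxon HE is on the classpath and uses its JAXP factory class net.sf.saxon.TransformerFactoryImpl; the class ExplicitFactory and the method byClassName are hypothetical names. TransformerFactory.newInstance(String, ClassLoader) bypasses the JAXP lookup, so the result does not depend on the system property.

```java
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.TransformerFactoryConfigurationError;

// Hypothetical helper for illustration; not part of FOP or Saxon.
final class ExplicitFactory {

    // Saxon's JAXP factory class; only loadable when the Saxon jar is on the classpath.
    static final String SAXON_FACTORY = "net.sf.saxon.TransformerFactoryImpl";

    /** Instantiates the named factory directly; null means the thread context class loader is used. */
    static TransformerFactory byClassName(String className) {
        return TransformerFactory.newInstance(className, null);
    }

    public static void main(String[] args) {
        try {
            TransformerFactory saxon = byClassName(SAXON_FACTORY);
            System.out.println("Using " + saxon.getClass().getName());
        } catch (TransformerFactoryConfigurationError e) {
            // Thrown when the class cannot be loaded or instantiated.
            System.out.println("Saxon factory not available: " + e.getMessage());
        }
    }
}
```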

      Attachments

        1. PDFA3Xmp.xconf
          0.4 kB
          Jörn Willhöft
        2. PDFXMP.fo
          1 kB
          Jörn Willhöft

          People

            Assignee: Unassigned
            Reporter: Jörn Willhöft (jwillhoeft)