Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version: 2.10
None
None
Environment:
FOP 2.10
Saxon 12.5 HE
OpenJDK 21
Fedora 41
Description
FOP generates an invalid XMP stream when Saxon is on the classpath, which results in invalid PDF/A-3 files. I originally ran into this problem in the context of Apache Camel, but eventually traced it back to FOP. Here is a procedure to reproduce it.
Please find the xconf and the static FO attached. I use the FOP binary package from https://www.apache.org/dyn/closer.cgi?filename=/xmlgraphics/fop/binaries/fop-2.10-bin.tar.gz&action=download, the current Saxon 12.5 HE and OpenJDK 21 on Fedora 41.
Create PDF without Saxon:
$ fop-2.10/fop/fop -c PDFA3Xmp.xconf -fo PDFXMP.fo -pdf output-ok.pdf
Create PDF with Saxon:
$ export CLASSPATH=SaxonHE12-5J/saxon-he-12.5.jar:SaxonHE12-5J/lib/xmlresolver-5.2.2.jar:SaxonHE12-5J/lib/xmlresolver-5.2.2-data.jar:SaxonHE12-5J/lib/jline-2.14.6.jar
$ fop-2.10/fop/fop -c PDFA3Xmp.xconf -fo PDFXMP.fo -pdf output-saxon.pdf
Here is the result without Saxon:
$ pdfinfo -meta output-ok.pdf
...
<rdf:Description xmlns:pdfaExtension="http://www.aiim.org/pdfa/ns/extension/" rdf:about="">
  <pdfaExtension:schemas>
    <rdf:Bag>
      <rdf:li rdf:parseType="Resource">
        <pdfaSchema:property xmlns:pdfaSchema="http://www.aiim.org/pdfa/ns/schema#">
          <rdf:Seq>
            <rdf:li rdf:parseType="Resource">
              <pdfaProperty:name xmlns:pdfaProperty="http://www.aiim.org/pdfa/ns/property#">split</pdfaProperty:name>
            </rdf:li>
          </rdf:Seq>
        </pdfaSchema:property>
      </rdf:li>
    </rdf:Bag>
  </pdfaExtension:schemas>
</rdf:Description>
...
Here is the result with Saxon:
$ pdfinfo -meta output-saxon.pdf
...
<rdf:RDF xmlns:pdfaExtension="http://www.aiim.org/pdfa/ns/extension/" rdf:about="">
  <pdfaExtension:schemas>
    <rdf:Bag>
      <rdf:li rdf:parseType="Resource">
        <pdfaSchema:property>
          <rdf:Seq>
            <rdf:li rdf:parseType="Resource">
              <pdfaProperty:name>split</pdfaProperty:name>
            </rdf:li>
          </rdf:Seq>
        </pdfaSchema:property>
      </rdf:li>
    </rdf:Bag>
  </pdfaExtension:schemas>
</rdf:RDF>
...
Not only are the namespace declarations xmlns:pdfaSchema and xmlns:pdfaProperty missing, but the rdf:Description element is also emitted as rdf:RDF (?!)
Expected behavior: the XMP stream should be equivalent regardless of whether Saxon or only the JDK's built-in XSLT transformer is available. In particular, the result must be namespace-well-formed XML, with a declaration for every prefix it uses.
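The missing declarations can be detected mechanically. The following is a minimal sketch, not part of FOP: the class name `XmpNamespaceCheck` and its method are hypothetical, and it assumes the XMP packet has already been extracted to a string (for example from the `pdfinfo -meta` output). It parses the text with the JDK's namespace-aware parser, which rejects any prefix that has no declaration in scope.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper for illustration; not part of FOP or Saxon.
final class XmpNamespaceCheck {

    // Returns true if the XML parses with namespace processing enabled,
    // i.e. it is well-formed and every prefix in use is declared.
    static boolean isNamespaceWellFormed(String xml) {
        try {
            // Use the JDK's built-in parser regardless of the classpath.
            DocumentBuilderFactory factory = DocumentBuilderFactory.newDefaultInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Turn every reported error into an exception instead of stderr output.
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { /* ignored */ }
                @Override public void error(SAXParseException e) throws SAXException { throw e; }
                @Override public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed, or a prefix is unbound
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Reduced stand-ins for the two outputs shown above.
        String declared = "<r xmlns:pdfaProperty=\"http://www.aiim.org/pdfa/ns/property#\">"
                + "<pdfaProperty:name>split</pdfaProperty:name></r>";
        String undeclared = "<r><pdfaProperty:name>split</pdfaProperty:name></r>";
        System.out.println("declared prefix:   " + isNamespaceWellFormed(declared));
        System.out.println("undeclared prefix: " + isNamespaceWellFormed(undeclared));
    }
}
```

A check like this only covers the namespace problem; it does not notice that rdf:Description was replaced by rdf:RDF, since that output is still well-formed XML.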
Possible workaround: the JVM's default XSLT transformer can be set back to the JDK's built-in implementation with a Java command-line parameter:
-Djavax.xml.transform.TransformerFactory=com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl
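To confirm that the parameter takes effect, it can help to print which implementation the standard JAXP lookup actually resolves. This is a small sketch under the assumption that it is run with the same classpath and JVM options as FOP; the class name `ActiveTransformerFactory` is made up for illustration.

```java
import javax.xml.transform.TransformerFactory;

// Hypothetical diagnostic for illustration; not part of FOP.
final class ActiveTransformerFactory {

    // Name of the implementation class that TransformerFactory.newInstance()
    // resolves with the current classpath and system properties.
    static String activeFactoryName() {
        return TransformerFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(activeFactoryName());
    }
}
```

With the Saxon jars on the classpath and no `-D` parameter, this is expected to print a Saxon class name; with the parameter set, it should print the class named in the parameter.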
Caveat: if Saxon is on the classpath, there is probably a reason for it, and other parts of the application might expect Saxon to be the default. Those parts would then have to instantiate the Saxon transformer factory explicitly.
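Explicit instantiation could look roughly like the sketch below. The class `ExplicitFactories` and its method names are hypothetical; `net.sf.saxon.TransformerFactoryImpl` is Saxon's JAXP factory class, and `TransformerFactory.newDefaultInstance()` (Java 9 and later) always returns the JDK's built-in implementation, whatever the classpath or system properties say.

```java
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.TransformerFactoryConfigurationError;

// Hypothetical helper for illustration; not part of FOP or Saxon.
final class ExplicitFactories {

    static final String SAXON_FACTORY = "net.sf.saxon.TransformerFactoryImpl";

    // The JDK's built-in transformer, independent of classpath and -D settings.
    static TransformerFactory jdkBuiltIn() {
        return TransformerFactory.newDefaultInstance();
    }

    // Saxon, requested by class name; throws
    // TransformerFactoryConfigurationError if Saxon is not on the classpath.
    static TransformerFactory saxon() {
        return TransformerFactory.newInstance(
                SAXON_FACTORY, ExplicitFactories.class.getClassLoader());
    }

    public static void main(String[] args) {
        System.out.println("built-in: " + jdkBuiltIn().getClass().getName());
        try {
            System.out.println("saxon:    " + saxon().getClass().getName());
        } catch (TransformerFactoryConfigurationError e) {
            System.out.println("Saxon is not on the classpath: " + e.getMessage());
        }
    }
}
```

Code that needs Saxon would call something like `saxon()` instead of relying on `TransformerFactory.newInstance()`, so it keeps working when the `-D` workaround is active.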