Description
Tilman asked me to open two separate issues for the finding in TIKA-1678 that XMPBox does not generate a valid dc:title entry in the XMP. This issue tracks preflight's failure to detect that problem.
What PDFBox does:
<dc:title>
<rdf:Alt>
<dc:li>this is the title</dc:li>
</rdf:Alt>
</dc:title>
It should be:
<dc:title>
<rdf:Alt>
<rdf:li xml:lang="x-default">this is the title</rdf:li>
</rdf:Alt>
</dc:title>
Error message from the PDF-Tools validator:
'dc:li' is not allowed in arrays. The elements must be rdf:li or rdf:_N, where N is a positive number.
There is only one RDF resource allowed in XMP.
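To make the first rule in that message concrete, here is a minimal sketch of such a check using only the JDK's DOM parser. It is not preflight's or XMPBox's actual code: the class name RdfArrayItemCheck and its methods are hypothetical, and it assumes the snippet is parsed with the rdf and dc prefixes declared (the fragments above omit the namespace declarations). It covers only the array-item rule, not the xml:lang="x-default" attribute or the single-RDF-resource rule.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only.
class RdfArrayItemCheck {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    /** Returns the qualified names of array children that are neither rdf:li nor rdf:_N. */
    static List<String> invalidArrayItems(String xml) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed so rdf:li and dc:li can be told apart
            doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XMP fragment", e);
        }

        List<String> invalid = new ArrayList<>();
        // rdf:Alt, rdf:Bag and rdf:Seq are the three RDF array containers.
        for (String container : new String[] {"Alt", "Bag", "Seq"}) {
            NodeList arrays = doc.getElementsByTagNameNS(RDF_NS, container);
            for (int i = 0; i < arrays.getLength(); i++) {
                NodeList children = arrays.item(i).getChildNodes();
                for (int j = 0; j < children.getLength(); j++) {
                    Node child = children.item(j);
                    if (child.getNodeType() != Node.ELEMENT_NODE) {
                        continue; // skip whitespace text nodes and comments
                    }
                    if (!isArrayItem((Element) child)) {
                        invalid.add(child.getNodeName()); // e.g. "dc:li"
                    }
                }
            }
        }
        return invalid;
    }

    /** True for rdf:li and for rdf:_N where N is a positive number. */
    static boolean isArrayItem(Element e) {
        if (!RDF_NS.equals(e.getNamespaceURI())) {
            return false;
        }
        String local = e.getLocalName();
        return "li".equals(local) || local.matches("_[1-9][0-9]*");
    }
}
```

With the namespaces declared on dc:title, invalidArrayItems reports "dc:li" for the first fragment above and returns an empty list for the corrected one.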
Issue Links
- is related to
- TIKA-1678 PDF metadata extraction fails to spot UTF-16 encoded title (Resolved)
- PDFBOX-2896 XMPBox not creating valid "title" entry in DublinCoreSchema in trunk (Closed)