PDFBOX-2897

Preflight not flagging bad xml generated by XMPBox for dc:title

Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 2.0.0
    • Fix Version/s: None
    • Component/s: Preflight, XmpBox
    • Labels: None

    Description

      Tilman asked that I open two separate issues for the finding in TIKA-1678 that XMPBox does not generate a valid dc:title entry in the XMP. This issue tracks Preflight's failure to detect that problem.

      What XMPBox currently generates:

            <dc:title>
              <rdf:Alt>
                <dc:li>this is the title</dc:li>
              </rdf:Alt>
            </dc:title>
      

      It should be:

                <dc:title>
                  <rdf:Alt>
                    <rdf:li xml:lang="x-default">this is the title</rdf:li>
                  </rdf:Alt>
                </dc:title>
      

      Error message from the PDF-Tools validator:

      'dc:li' is not allowed in arrays. The elements must be rdf:li or rdf:_N, where N is a positive number.
      There is only one RDF resource allowed in XMP.
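
      The array-item rule quoted in the validator message can be sketched as a standalone check. This is only an illustration built on the JDK's DOM parser, not Preflight's actual validation code; the class name XmpArrayItemCheck and the method findInvalidArrayItems are hypothetical, and the namespace declarations in the sample strings are added here because the fragments above omit them.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of PDFBox, XMPBox, or Preflight.
public class XmpArrayItemCheck {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String DC_NS = "http://purl.org/dc/elements/1.1/";

    /** Returns the qualified names of array children that are neither rdf:li nor rdf:_N. */
    public static List<String> findInvalidArrayItems(String xml) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed to compare namespace URIs
            doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }

        List<String> invalid = new ArrayList<>();
        for (String container : new String[] {"Alt", "Bag", "Seq"}) {
            NodeList arrays = doc.getElementsByTagNameNS(RDF_NS, container);
            for (int i = 0; i < arrays.getLength(); i++) {
                NodeList children = arrays.item(i).getChildNodes();
                for (int j = 0; j < children.getLength(); j++) {
                    Node child = children.item(j);
                    if (child.getNodeType() != Node.ELEMENT_NODE) {
                        continue; // skip whitespace text nodes and comments
                    }
                    String local = child.getLocalName();
                    boolean allowed = RDF_NS.equals(child.getNamespaceURI())
                            && (local.equals("li") || local.matches("_[1-9][0-9]*"));
                    if (!allowed) {
                        invalid.add(child.getNodeName());
                    }
                }
            }
        }
        return invalid;
    }

    public static void main(String[] args) {
        String open = "<dc:title xmlns:rdf=\"" + RDF_NS + "\" xmlns:dc=\"" + DC_NS + "\"><rdf:Alt>";
        String close = "</rdf:Alt></dc:title>";
        String bad = open + "<dc:li>this is the title</dc:li>" + close;
        String good = open + "<rdf:li xml:lang=\"x-default\">this is the title</rdf:li>" + close;

        System.out.println(findInvalidArrayItems(bad));  // prints [dc:li]
        System.out.println(findInvalidArrayItems(good)); // prints []
    }
}
```

      The sketch only covers the first line of the validator message (the element name of array items); it does not check the xml:lang attribute or the number of RDF resources.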

      Attachments

        1. PDFBOX-2897-PDFA-BadXMP.pdf
          16 kB
          Tilman Hausherr
        2. PDFBOX-2897-PDFA-BadXMP2.pdf
          16 kB
          Tilman Hausherr

            People

              Assignee: Unassigned
              Reporter: Tim Allison (tallison)
              Votes: 0
              Watchers: 3