PDFBox / PDFBOX-2897

Preflight not flagging bad xml generated by XMPBox for dc:title

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 2.0.0
    • Fix Version/s: 3.0.0 PDFBox
    • Component/s: Preflight, XmpBox
    • Labels:
      None

      Description

      Tilman Hausherr asked that I open two separate issues for the finding in TIKA-1678 that XMPBox does not generate a valid dc:title entry in the XMP. This issue tracks Preflight's failure to detect that problem.

      What XMPBox currently generates:

            <dc:title>
              <rdf:Alt>
                <dc:li>this is the title</dc:li>
              </rdf:Alt>
            </dc:title>
      

      It should be:

            <dc:title>
              <rdf:Alt>
                <rdf:li xml:lang="x-default">this is the title</rdf:li>
              </rdf:Alt>
            </dc:title>
      

      Error message from the PDF-Tools validator:

      'dc:li' is not allowed in arrays. The elements must be rdf:li or rdf:_N, where N is a positive number.
      There is only one RDF resource allowed in XMP.
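
      For illustration, the rule quoted in the error message (array children must be rdf:li or rdf:_N) can be checked with a few lines of plain JDK DOM code. This is only a sketch of the kind of check Preflight is missing, not Preflight's or XMPBox's actual code. The class name XmpArrayItemCheck and the method findInvalidArrayItems are hypothetical, and the sketch assumes the XMP packet is available as a namespace-well-formed XML string.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of PDFBox: reports array children that are
// neither rdf:li nor rdf:_N inside rdf:Alt, rdf:Bag and rdf:Seq containers.
class XmpArrayItemCheck {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    /** Returns the qualified names of all invalid array item elements. */
    static List<String> findInvalidArrayItems(String xmp) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // needed to compare namespace URIs
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xmp)));

        List<String> invalid = new ArrayList<>();
        for (String container : new String[] {"Alt", "Bag", "Seq"}) {
            NodeList arrays = doc.getElementsByTagNameNS(RDF_NS, container);
            for (int i = 0; i < arrays.getLength(); i++) {
                NodeList children = arrays.item(i).getChildNodes();
                for (int j = 0; j < children.getLength(); j++) {
                    Node child = children.item(j);
                    // Skip whitespace text nodes and comments.
                    if (child.getNodeType() != Node.ELEMENT_NODE) {
                        continue;
                    }
                    if (!isArrayItem(child)) {
                        invalid.add(child.getNodeName()); // e.g. "dc:li"
                    }
                }
            }
        }
        return invalid;
    }

    // An array item must be in the RDF namespace and be named li or _N (N >= 1).
    private static boolean isArrayItem(Node node) {
        if (!RDF_NS.equals(node.getNamespaceURI())) {
            return false;
        }
        String local = node.getLocalName();
        return "li".equals(local) || local.matches("_[1-9][0-9]*");
    }
}
```

      Run against the first snippet above (wrapped in an rdf:RDF element that declares the rdf and dc prefixes), this would report dc:li; run against the corrected snippet it would report nothing. The sketch does not check the xml:lang="x-default" attribute, which a complete validation of a language alternative would also need to cover.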

        Attachments

        1. PDFBOX-2897-PDFA-BadXMP.pdf
          16 kB
          Tilman Hausherr
        2. PDFBOX-2897-PDFA-BadXMP2.pdf
          16 kB
          Tilman Hausherr

              People

              • Assignee:
                Unassigned
              • Reporter:
                Tim Allison (tallison@mitre.org)
              • Votes:
                0
              • Watchers:
                3
