  1. PDFBox
  2. PDFBOX-4521

Missing Info value from file trailer: org.apache.pdfbox.cos.COSName cannot be cast to org.apache.pdfbox.cos.COSDictionary

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.15
    • Fix Version/s: 2.0.16, 3.0.0 PDFBox
    • Component/s: Parsing
    • Labels: None

    Description

      The following exception

      Cause: java.lang.ClassCastException: org.apache.pdfbox.cos.COSName cannot be cast to org.apache.pdfbox.cos.COSDictionary
          at org.apache.pdfbox.pdmodel.PDDocument.getDocumentInformation(PDDocument.java:740)
          at org.apache.tika.parser.pdf.PDFParser.extractMetadata(PDFParser.java:242)
          at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:154)
          at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
          at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
          at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:135)

      is thrown for PDF documents whose file trailer contains the Info key without a value, e.g.:

      << /Size 50/Root 8 0 R/Info /ID >>
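
      One way to see why the exception mentions COSName: if the trailer above is read as plain key/value pairs, the object that follows /Info is the name /ID, so /ID ends up as the value of Info. The following is a minimal sketch of that reading. TrailerSketch and its methods are hypothetical and not part of PDFBox, and the toy parser only understands names, integers, and indirect references.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy parser, NOT PDFBox code: pairs each name key in a
// dictionary body with the object that follows it.
class TrailerSketch {

    // Split the dictionary body into tokens. A '/' always starts a new
    // name token, even without preceding whitespace (as in "50/Root").
    static List<String> tokenize(String dict) {
        String body = dict.replace("<<", " ").replace(">>", " ").replace("/", " /");
        List<String> tokens = new ArrayList<>();
        for (String t : body.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Read key/value pairs in order. A value is either an indirect
    // reference ("<num> <gen> R") or the single token after the key.
    static Map<String, String> parse(String dict) {
        List<String> t = tokenize(dict);
        Map<String, String> entries = new LinkedHashMap<>();
        int i = 0;
        while (i < t.size()) {
            String key = t.get(i++);
            if (!key.startsWith("/")) {
                throw new IllegalArgumentException("expected a name key, got " + key);
            }
            if (i >= t.size()) {
                entries.put(key, null); // key at the very end, nothing follows
                break;
            }
            boolean indirect = i + 2 < t.size()
                    && "R".equals(t.get(i + 2))
                    && !t.get(i).startsWith("/")
                    && !t.get(i + 1).startsWith("/");
            if (indirect) {
                entries.put(key, t.get(i) + " " + t.get(i + 1) + " R");
                i += 3;
            } else {
                entries.put(key, t.get(i));
                i += 1;
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        // prints {/Size=50, /Root=8 0 R, /Info=/ID}
        System.out.println(parse("<< /Size 50/Root 8 0 R/Info /ID >>"));
    }
}
```

      Under this reading the trailer has three entries and no ID entry at all: the name /ID has been consumed as the value of Info.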
      

      According to the PDF specification, the Info key in the file trailer is optional. PDFBox correctly handles a trailer that has no Info entry at all, but here the key is present without a value.
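
      The failure mode and one possible defensive check can be illustrated without PDFBox. This is only a sketch under stated assumptions: Obj, Name, Dict, unsafeInfo and safeInfo are hypothetical stand-ins for the COS object model, and this is not necessarily how the issue was fixed in 2.0.16.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for a COS-like object model, NOT PDFBox classes.
class InfoLookupSketch {

    interface Obj {}

    // Stand-in for a name object such as /ID.
    static final class Name implements Obj {
        final String value;
        Name(String value) { this.value = value; }
    }

    // Stand-in for a dictionary object.
    static final class Dict implements Obj {
        final Map<String, Obj> items = new HashMap<>();
    }

    // Blind cast: throws ClassCastException when Info holds a Name.
    static Dict unsafeInfo(Dict trailer) {
        return (Dict) trailer.items.get("Info");
    }

    // Checked lookup: any non-dictionary value (including a missing one)
    // is treated like an absent Info entry and yields an empty dictionary.
    static Dict safeInfo(Dict trailer) {
        Obj value = trailer.items.get("Info");
        if (value instanceof Dict) {
            return (Dict) value;
        }
        return new Dict();
    }

    public static void main(String[] args) {
        Dict trailer = new Dict();
        trailer.items.put("Info", new Name("ID")); // mirrors "/Info /ID"
        System.out.println(safeInfo(trailer).items.isEmpty()); // prints true
        try {
            unsafeInfo(trailer);
        } catch (ClassCastException e) {
            System.out.println("unsafe lookup failed with ClassCastException");
        }
    }
}
```

      Returning an empty dictionary keeps the malformed case on the same path as a trailer with no Info entry; a caller that needs to tell the two apart would have to report the wrong type separately.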

      Attachments

        1. Editathon_cheat_sheet_(EN).pdf
          164 kB
          Oliver Mannion
        2. Editathon_cheat_sheet_(EN)_MetaDefender.pdf
          162 kB
          Oliver Mannion

          People

            Assignee: Tilman Hausherr (tilman)
            Reporter: Oliver Mannion (oliman)
            Votes: 0
            Watchers: 3

              Time Tracking

                Estimated: 24h
                Remaining: 24h
                Logged: Not Specified