PDFBox / PDFBOX-3968

IllegalArgumentException: root cannot be null

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: 2.0.7
    • Fix Version/s: None
    • Component/s: PDModel
    • Labels: None

    Description

      I got the following exception while extracting HTML from a PDF file:

      org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.pdf.PDFParser@7ca231e4
      	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:282)
      	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
      	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
      	...
      Caused by: java.lang.IllegalArgumentException: root cannot be null
      	at org.apache.pdfbox.pdmodel.PDPageTree.<init>(PDPageTree.java:75)
      	at org.apache.pdfbox.pdmodel.PDDocumentCatalog.getPages(PDDocumentCatalog.java:129)
      	at org.apache.pdfbox.pdmodel.PDDocument.getNumberOfPages(PDDocument.java:1398)
      	at org.apache.tika.parser.pdf.PDFParser.extractMetadata(PDFParser.java:243)
      	at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:154)
      	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
      	... 25 more
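
      In the trace above, the IllegalArgumentException from PDPageTree is wrapped inside a TikaException. A caller that wants to recognise this particular failure could walk the cause chain. The following is only a minimal sketch: the class and method names (RootCauseSketch, rootCause, isMissingPageTreeRoot) are hypothetical, a plain Exception stands in for TikaException so that the example needs no Tika or PDFBox dependency, and matching on the message text is an assumption taken from this trace, not a documented contract.

```java
final class RootCauseSketch {

    /** Follows getCause() links and returns the innermost throwable. */
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    /**
     * Returns true if the innermost cause looks like the failure in this report.
     * Assumption: the message text matches the one shown in the stack trace.
     */
    static boolean isMissingPageTreeRoot(Throwable t) {
        Throwable root = rootCause(t);
        return root instanceof IllegalArgumentException
                && "root cannot be null".equals(root.getMessage());
    }

    public static void main(String[] args) {
        // Stand-in for the wrapping TikaException seen in the trace.
        Exception wrapped = new Exception(
                "Unexpected RuntimeException from parser",
                new IllegalArgumentException("root cannot be null"));
        System.out.println(isMissingPageTreeRoot(wrapped)); // prints "true"
    }
}
```

      Matching on an exception message is brittle, so a check like this is best treated as a diagnostic aid rather than a fix for the underlying document problem.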
      

          People

            Assignee: Unassigned
            Reporter: Giorgy Jorge Spinsanti
            Votes: 0
            Watchers: 2
