PDFBox / PDFBOX-1266

Converting certain pages of certain PDFs to images fails with java.lang.ClassCastException: org.apache.pdfbox.cos.COSNull cannot be cast to org.apache.pdfbox.cos.COSDictionary

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.6.0
    • Fix Version/s: 1.7.0
    • Component/s: PDModel
    • Labels:
      None
    • Environment:
      java version "1.6.0_30"
      Java(TM) SE Runtime Environment (build 1.6.0_30-b12)
      Java HotSpot(TM) 64-Bit Server VM (build 20.5-b03, mixed mode)

      Fedora 16 and Debian Squeeze; not tested on other systems

      Description

      Certain pages of certain PDF documents cannot be converted to images. The conversion fails with: java.lang.ClassCastException: org.apache.pdfbox.cos.COSNull cannot be cast to org.apache.pdfbox.cos.COSDictionary

      The page is converted with the following call, where page is of type org.apache.pdfbox.pdmodel.PDPage: BufferedImage image = page.convertToImage(BufferedImage.TYPE_3BYTE_BGR, 300);
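
Until an upgrade to the fixed release (1.7.0) is possible, a caller might isolate the failure so that one bad page does not abort the whole document. The following is only a sketch under stated assumptions: `PageConversionGuard`, `PageRenderer`, and `renderAll` are hypothetical names introduced here, and the renderer is a stand-in for a per-page call such as `page.convertToImage(...)`. It uses Java 8 syntax and no PDFBox classes.

```java
import java.util.ArrayList;
import java.util.List;

class PageConversionGuard {
    /** Hypothetical stand-in for a per-page rendering call such as page.convertToImage(...). */
    interface PageRenderer<P, I> {
        I render(P page) throws Exception;
    }

    /**
     * Renders every page in order. Successfully rendered images are appended to
     * 'out'; the zero-based indices of pages that threw are returned.
     */
    static <P, I> List<Integer> renderAll(List<P> pages, PageRenderer<P, I> renderer, List<I> out) {
        List<Integer> failed = new ArrayList<>();
        for (int i = 0; i < pages.size(); i++) {
            try {
                out.add(renderer.render(pages.get(i)));
            } catch (Exception e) {
                // ClassCastException is a RuntimeException, so it is caught here too.
                failed.add(i);
            }
        }
        return failed;
    }
}
```

The returned indices identify the problematic pages, which can then be reported or retried with a newer library version.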

      Stack trace (relevant part):
      java.lang.ClassCastException: org.apache.pdfbox.cos.COSNull cannot be cast to org.apache.pdfbox.cos.COSDictionary
      at org.apache.pdfbox.pdmodel.graphics.xobject.PDCcitt.getRGBImage(PDCcitt.java:119)
      at org.apache.pdfbox.util.operator.pagedrawer.Invoke.process(Invoke.java:78)
      at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:551)
      at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:274)
      at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:251)
      at org.apache.pdfbox.util.operator.pagedrawer.Invoke.process(Invoke.java:130)
      at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:551)
      at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:274)
      at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:251)
      at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:225)
      at org.apache.pdfbox.pdfviewer.PageDrawer.drawPage(PageDrawer.java:107)
      at org.apache.pdfbox.pdmodel.PDPage.convertToImage(PDPage.java:722)
      at eu.eudml.enhancement.pdf2textviaocr.PdfImageExtractor.extractImagesUsingPdfParser(PdfImageExtractor.java:236)
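
The top frame of the trace indicates that a value was cast to a dictionary type without first checking its runtime type. The sketch below illustrates that general failure mode and one defensive alternative. It does not reproduce PDCcitt.java line 119, which is not shown in this report, and it is not necessarily how the issue was fixed in 1.7.0. The classes `Base`, `NullObject`, and `Dictionary` are hypothetical stand-ins for the COS object model, not the PDFBox classes.

```java
import java.util.HashMap;
import java.util.Map;

class NullSafeCastSketch {
    // Hypothetical stand-ins for a COS-like object hierarchy (NOT the PDFBox classes).
    static class Base {}
    static final class NullObject extends Base {}
    static final class Dictionary extends Base {
        final Map<String, Base> items = new HashMap<String, Base>();
    }

    /** Unchecked cast: throws ClassCastException if the stored value is a NullObject. */
    static Dictionary uncheckedLookup(Map<String, Base> entries, String key) {
        return (Dictionary) entries.get(key);
    }

    /** Guarded lookup: returns null unless the stored value really is a Dictionary. */
    static Dictionary guardedLookup(Map<String, Base> entries, String key) {
        Base value = entries.get(key);
        return (value instanceof Dictionary) ? (Dictionary) value : null;
    }
}
```

With the guarded variant, a caller can treat an explicit null object the same way as a missing entry instead of failing with an exception.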

        Attachments

        1. problematic.pdf
          895 kB
          Radim Hatlapatka

            People

            • Assignee: Andreas Lehmkühler (lehmi)
            • Reporter: Radim Hatlapatka (nickradas)